Advancing Human Language Technology for International Literacy and Development :: with Daniel Wilson, PhD
In this episode of Ventures, my guest Daniel Wilson (https://www.linkedin.com/in/daniel-wilson-phd-46956b35) and I discuss his work with International Literacy and Development (ILAD, https://ilad.ngo/), specifically in the area of human language technology. We talk about the landscape of low-resourced languages that are unfortunately going extinct, why it’s important to preserve these language communities, and how technology can be used to develop languages and promote literacy. We also tee up how Web3 will be able to accelerate literacy and development work worldwide, which we talk about in more depth in Part 2 (which will be released next week).
1:59 - Tee up for this episode: why “solving problems” has a lot more weight and importance for venture builders and investors in the developing world.
2:40 - Daniel’s background in compassionate innovation, human language technology, and ILAD.
3:43 - What is ILAD? (International Literacy and Development, more details)
4:42 - More about Daniel’s background and why he got into linguistics.
6:18 - Quick introduction to linguistics. Different types, approaches, etc…
8:14 - How do linguists define what a language is vs. a dialect?
9:20 - Many languages are spoken by a small number of people and are going extinct. Where in the world are these languages? What kind of numbers are we talking about? What’s the life situation of these people? (Correction here from Daniel: a minority of languages are spoken-only, just over 3k of them; he incorrectly said “majority” in the show.)
13:25 - Richness of preserving languages around the world. The importance of helping with language development and literacy.
15:03 - How many people around the world are in low-resourced language communities? Example from Arabic speakers / dialects.
17:11 - Human language technology (HLT) - what is Daniel working on to help humans flourish via HLT?
22:34 - What is the difference between a low-resource and a medium-resource language? How are these defined?
23:26 - Goal will be to get low-resourced languages to have 500k+ parallel sentences with other languages, then they can be used to train a model.
23:52 - How many high-resource languages are out there? (Correction: the actual number is around 20, Daniel said “less than 200”, so he was shooting high on that one off the top of his head)
25:03 - From a market understanding lens, what is the future of HLT?
26:54 - How many of the lower-resourced communities have access to power/electricity?
29:00 - Anything else that Daniel would add from a foundational perspective on this topic? Automated payments w/ cryptocurrencies (we’ll dive into this a ton more next week in Part 2).
32:08 - Where can people find Daniel and ILAD online? Twitter: @danwils (academic) @geoframeai (web3 & smart stream) // Also: @_smartstream_ // LinkedIn: https://www.linkedin.com/in/daniel-wilson-phd-46956b35/ // email: firstname.lastname@example.org