Meta Is Training Its AI on the Bible and Other Religious Texts

Source: Decrypt

The parent company of Facebook and Instagram says it has developed AI-powered speech technology that can identify over 4,000 languages. The goal, Meta says, is to preserve the world's languages, and the tech giant is using the Bible and other religious texts to do it.


"Collecting audio data for thousands of languages was our first challenge because the largest existing speech datasets cover 100 languages at most," Meta said in a post announcing the project. "To overcome this, we turned to religious texts, such as the Bible, that have been translated in many different languages and whose translations have been widely studied for text-based language translation research."


In an accompanying research paper by the Meta AI core team, the company says it obtained its data from the Bible, including original text and audio recordings from FaithComesByHearing.com, GoTo.Bible, and Bible.com.


The project includes recordings of Bible stories, evangelistic messages, scripture readings, and songs in more than 6,255 languages and dialects. While most recordings feature male readers, Meta says its models work equally well for female voices.


A dataset of New Testament readings, Meta said, covered more than 1,100 languages, with an average of 32 hours of data per language.


According to Broward College's Lingua Language Center, there are over 7,100 living languages worldwide.


"Our consultations with Christian ethicists concluded that most Christians would not regard the New Testament, and translations thereof, as too sacred to be used in machine learning," the Meta AI team said, adding that the same is not true for all religious texts.


“There is also the risk of religious training data biasing the models with respect to a particular world view,” Meta AI said. “However, our analysis of the language generated by our models suggests that the language produced by the resulting speech recognition models exhibit only little bias compared to baseline models trained on other domains.”


After its metaverse ambitions fizzled earlier this year, Meta appears to have shifted its focus to artificial intelligence, including building an AI tool to identify and separate items in pictures and an AI-powered tool to help brands target users on its Facebook and Instagram platforms.


While the technology is still in its early stages, Meta says it is open-sourcing its data and code so that others can build on, develop, and improve the platform.


"Many of the world’s languages are in danger of disappearing, and the limitations of current speech recognition and generation technology will only accelerate this trend," Meta said. "We want to make it easier for people to access information and use devices in their preferred language, and today we’re announcing a series of artificial intelligence models that could help them do just that."

