Please see the README file for more information.
The release contains the following:
ORIGINAL SOURCE OF DATA
The following TED Talks, which form the test set of the IWSLT13 Shared Task dataset, were downloaded from WIT3:
767 - Bill Gates on Energy: Innovating to Zero!
769 - Aimee Mullins: The Opportunity of Adversity
779 - Daniel Kahneman: The Riddle of Experience vs. Memory
783 - Gary Flake: Is Pivot a Turning Point for Web Exploration?
785 - James Cameron: Before Avatar ... a Curious Boy
790 - Dan Barber: How I Fell in Love With a Fish
792 - Eric Mead: The Magic of the Placebo
799 - Jane McGonigal: Gaming Can Make a Better World
805 - Robert Gupta: Music is Medicine, Music is Sanity
824 - Michael Specter: The Danger of Science Denial
837 - Tom Wujec: Build a Tower, Build a Team
The following EU Bookshop documents were downloaded from the EU Bookshop online archive in E-Book format, and the raw text was extracted using the Calibre E-Book Management tool:
KEBC11002 - Social Dialogue
KEBC12001 - Demography, Active Ageing and Pensions
KH7911105 - Soil
MI3112464 - Road Transport
MJ3011331 - Energy
NA3211776 - Europe in 12 Lessons
QE3011322 - Shaping Europe
QE3211790 - Active citizenship