Facebook parent Meta has announced a new tool called ‘Sphere’, an AI built to tap into the vast repository of information on the open web to build a knowledge base. It will initially be used to verify citations on Wikipedia.
To train the Sphere model, Meta created “a new data set (WAFER) of 4 million Wikipedia citations, significantly more intricate than ever used for this sort of research.” Separately, Wikipedia editors are using a new AI-based language translation tool.
The Meta research team has open-sourced Sphere, which is currently based on 134 million public web pages, split into 906 million passages of 100 tokens each, in order to form a solid foundation for AI training models and give users the ability to shape it for different applications.
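As a rough illustration of what splitting pages into fixed-size passages involves, the sketch below chunks text into passages of 100 tokens. It assumes simple whitespace tokenization, and the helper name `split_into_passages` is hypothetical; Meta's actual tokenizer and pipeline are not described in this article.

```python
from typing import List


def split_into_passages(text: str, passage_len: int = 100) -> List[str]:
    """Split text into consecutive passages of at most `passage_len` tokens.

    Assumption: tokens are whitespace-separated words. A real pipeline
    would likely use a model-specific tokenizer instead.
    """
    if passage_len <= 0:
        raise ValueError("passage_len must be positive")
    tokens = text.split()
    return [
        " ".join(tokens[i:i + passage_len])
        for i in range(0, len(tokens), passage_len)
    ]


if __name__ == "__main__":
    # A made-up 250-token "page" yields passages of 100, 100 and 50 tokens.
    page = " ".join(f"word{i}" for i in range(250))
    for passage in split_into_passages(page):
        print(len(passage.split()))
```

At the reported figures, 906 million passages across 134 million pages works out to roughly 6.8 passages per page on average.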
Wikipedia runs to 6.5 million entries, growing by an average of 17,000 articles per month, and the process of adding and editing its articles is crowdsourced. Wikimedia, the foundation that oversees Wikipedia, has been thinking of ways to harness that data.
Sphere can automatically scan hundreds of thousands of citations simultaneously and spot those that lack support. Meta explains: “If a citation seems irrelevant, our model will suggest a more applicable source, even pointing to the specific passage that supports the claim.”
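The idea of checking a claim against its cited source, and suggesting a better passage when the citation looks weak, can be sketched with a toy example. This is not Meta's model: it substitutes plain word overlap (Jaccard similarity) for a learned verifier, and the function names, the threshold, and the sample passages are all hypothetical.

```python
import re
from typing import List, Optional, Set, Tuple


def _words(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim: str, passage: str) -> float:
    """Jaccard overlap between claim and passage words (a crude stand-in)."""
    a, b = _words(claim), _words(passage)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def check_citation(
    claim: str,
    cited_passage: str,
    candidates: List[str],
    threshold: float = 0.3,  # arbitrary cutoff chosen for this illustration
) -> Tuple[bool, Optional[str]]:
    """Return (is_supported, suggested_passage).

    If the cited passage scores below the threshold, suggest the
    best-scoring candidate, provided it clears the threshold itself.
    """
    if support_score(claim, cited_passage) >= threshold:
        return True, None
    best = max(candidates, key=lambda p: support_score(claim, p), default=None)
    if best is not None and support_score(claim, best) >= threshold:
        return False, best
    return False, None


if __name__ == "__main__":
    claim = "The bridge opened in 1932"
    cited = "The city hosts an annual jazz festival"
    pool = [
        "The bridge opened to traffic in 1932",
        "Local cuisine includes seafood",
    ]
    print(check_citation(claim, cited, pool))
```

In this made-up case the cited passage shares only one word with the claim, so the sketch flags it and proposes the first candidate instead. A production system would rely on learned retrieval and verification models rather than word overlap.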
In a blog post, Meta noted: “Because Sphere can access far more public information than today’s standard models, it could provide useful information that they cannot.”
Last month, Wikimedia announced an Enterprise tier and its first two commercial customers, Google and the Internet Archive. Although the Sphere announcement doesn’t reference Wikimedia Enterprise, customers will want to know that content is verified and accurate before they consider paying for the service. As for the Sphere arrangement itself, Meta confirmed that it involves no financial terms.
“Our next step is to train models to assess the quality of retrieved documents, detect potential contradictions, prioritize more trustworthy sources — and, if no convincing evidence exists, concede that they, like us, can still be stumped,” Meta noted.
Meta has confirmed that it is not using Sphere to fight misinformation on its own platforms, such as Facebook, Instagram, and Messenger; it relies on separate tools to manage and moderate content there.