As a mainstream narrative of American life paints a bleak picture of misinformed internet partisans clashing over a landscape stripped of anything resembling truth or reality, a new study from the Georgia Tech School of Public Policy offers a different view. It also advances the use of machine learning in the social sciences and the understanding of how much open-access scientific information matters to everyday Americans.
The study, published on February 23, 2022, in the prestigious Proceedings of the National Academy of Sciences (PNAS), analyzed the reasons behind 1.6 million downloads of consensus reports from the National Academies of Sciences, Engineering, and Medicine (NASEM), which are considered among the most credible scientific publications.
The resulting analysis, which only includes downloads in the United States, is the first to look at who is using this information and why. Professor Diana Hicks, Assistant Professor Omar I. Asensio and Ph.D. students Matteo Zullo and Ameet Doshi, all of Georgia Tech’s School of Public Policy, co-authored the study.
They found that while nearly half of the reports were downloaded for academic purposes, even more were viewed by people outside strictly educational settings, such as veterans, chaplains and writers. The word “edification” appeared 3,700 times in the dataset, signaling a strong desire for lifelong learning among users.
“This study shows a strong demand among ordinary Americans for the highest quality information to help improve the work they do, to help their loved ones, neighbors and communities, and in some cases simply to learn to learn,” Hicks said. “We never hear these stories because everyone’s focused on all the misinformation circulating on social media.”
The study makes it clear that open access to scientific information is important to the average American, said co-author Ameet Doshi, who is a Ph.D. student in the School of Public Policy and director of the Donald E. Stokes Library at Princeton University.
“This research will hopefully raise awareness of the positive returns that accrue to society from investments in institutions that democratize public access to high-quality research,” said Doshi, who is analyzing similar data on downloads from the open-access portal for his thesis.
Machine learning is becoming a key tool for social scientists
The study looked at 1.6 million comments left on 6.6 million downloads of NASEM consensus reports since 2011, when the National Academies began offering them for free. The comments were left in response to a prompt asking users how they planned to use the reports.
The authors used a machine learning algorithm called BERT to analyze the comments, a new social science extension of Asensio’s work with machine learning techniques, which organize and make sense of unstructured data that would take people too much time to analyze directly. Asensio’s work has shown that such data can be of immense use to researchers and policymakers when researchers properly teach algorithms how to do the hard work. Asensio’s Data Science and Policy Lab has used deep learning techniques in recent years to advance knowledge in energy efficiency, sustainable plastics and electric vehicle charging infrastructure.
“When you get data at this scale, especially when you have unstructured data that grows in real time, there are practical limitations as to why these kinds of behavioral insights weren’t known before,” said Asensio. “We show in a number of research areas that experimental approaches to curating human-labeled training data can improve the performance of popular supervised ML algorithms, to a level that can match or even exceed human performance. This expands the possibilities for data discovery in the social sciences, so there was a compelling need to use these computational solutions to automatically classify behavioral evidence on public interest in scientific information.”
The analysis found that academic users accounted for 48% of commented downloads, an unsurprising result given the nature of the reports, which are densely scientific and primarily intended to serve the technical needs of federal agencies.
Learn to learn
It was the other uses that interested the researchers most, including downloads by amateur radio operators, amateur astronomers, lifelong learning providers, and retirees interested in keeping up.
About 150,000 downloads were categorized as having to do with “personal use,” including topics such as cannabis, death, genetically modified crops, evolution versus creationism, and gun violence reduction. The analysis also revealed that thousands of veterans planned to use NASEM reports as part of their disability applications with the United States Veterans Administration, with the 20 NASEM reports on Agent Orange, the health effects of burn pits, or high noise levels being the most frequently downloaded.
Over 25,000 doctors and nurses have downloaded reports with plans to use the details to improve their clinical work.
One user downloaded 551 reports for “personal edification,” according to the study.
The researchers also noted downloads by nonfiction authors, science fiction writers, and even visual artists. The reports have even been discussed in book clubs, according to the comments.
The algorithm used by the researchers identified the correct meaning about 84% of the time.
“There’s a performative aspect to language, there’s a descriptive aspect,” Zullo said. “The fact that a machine learning tool can predict meaning with this kind of accuracy is amazing.”
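For readers curious about what an accuracy figure like this measures, here is a minimal sketch. It assumes accuracy means the share of comments for which the algorithm's category matches the one a human reviewer assigned; the study may have used a different or more detailed metric. The function name, category names and sample labels are invented for illustration and do not come from the study's data.

```python
def accuracy(predicted, human_labels):
    """Return the fraction of predictions that match the human-assigned labels."""
    if len(predicted) != len(human_labels):
        raise ValueError("both lists must have the same length")
    if not predicted:
        raise ValueError("at least one labeled comment is required")
    correct = sum(p == h for p, h in zip(predicted, human_labels))
    return correct / len(predicted)


# Hypothetical categories for six download comments (illustrative only).
human_labels = ["academic", "personal", "clinical", "academic", "veteran", "personal"]
predicted = ["academic", "personal", "academic", "academic", "veteran", "personal"]

score = accuracy(predicted, human_labels)
print(f"{score:.0%}")  # 5 of 6 match, so this prints 83%
```

In practice such a score would be computed on a held-out set of human-labeled comments that the algorithm did not see during training.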
Americans “innately curious”
Overall, the results indicate a broad and impactful dissemination of knowledge stemming from NASEM’s decision to make the reports freely available, the authors wrote.
“Findings reveal adults who are motivated to seek out the most credible sources, engage with challenging material, use it to improve the services they provide, and learn more about the world they live in,” they stated. “The image stands in stark contrast to the mainstream narrative of an uninformed and manipulated public targeted by social media.”
That’s not to say misinformation on social media isn’t a problem, the authors note. Social media platforms are teeming with millions of false and misleading posts, many of which are posted by bots, which can contribute to belief in conspiracy theories, misinformation, and even state-sponsored disinformation.
However, in this case, the study shows an audience that – despite the widespread narrative lamenting the politicization of science and mistrust of scientists – still turns to experts to help unravel a complicated and ever-changing world.
“A lot of the American public is naturally curious and willing to tackle academic jargon to gain insight,” Doshi said. “That in itself is a comforting discovery.”
The article, “Widespread Use of National Academies Consensus Reports by the American Public,” is available at https://doi.org/10.1073/pnas.2107760119.
About Georgia Tech
The Georgia Institute of Technology, or Georgia Tech, is a top 10 public research university developing leaders who advance technology and improve the human condition.
The Institute offers business, computing, design, engineering, liberal arts, and sciences degrees. Its nearly 44,000 students, representing 50 states and 149 countries, study at the main campus in Atlanta, at campuses in France and China, and through distance and online learning.
As a leading technological university, Georgia Tech is an engine of economic development for Georgia, the Southeast, and the nation, conducting more than $1 billion in research annually for government, industry, and society.
Journal: Proceedings of the National Academy of Sciences
Article title: Widespread use of National Academies consensus reports by the American public