Archivists Want AI to Help Save, Analyze Everything Trump Says

By Carl Engelking | January 26, 2017 1:20 pm

(Credit: Joseph Sohm/Shutterstock)

A week hasn’t even passed since the inauguration, but television news is saturated with the flurry of activity from President Donald Trump’s administration. Trump, via Twitter, promised to launch an investigation into illegal voting and threatened to “send in the Feds” if Chicago police can’t fix the “carnage.” And that was just between Tuesday and Wednesday.

This heightened scrutiny compelled the Internet Archive, a nonprofit digital library that archives web pages and other media, to launch its Trump Archive in early January. You may have digitally time-traveled with the Internet Archive’s Wayback Machine, or checked out its free books, movies and software. The Trump Archive, which draws content from the Internet Archive’s TV News Archive, includes more than 520 hours of televised Trump speeches, interviews, debates and other broadcasts tracing back to 2009. It will continue to grow.

“There’s no accessible library of television news, so television ends up washing over us like a wave,” says Roger Macdonald, director of the Internet Archive’s TV News Archive.

The TV News Archive gives journalists, scholars and citizens a chance to breathe, reflect and process that television news whitecap after it crashes ashore. And in the case of the Trump Archive, it’s a tool to track Trump’s statements on public policy issues, and to ensure footage doesn’t succumb to the ephemeral nature of the Internet.

Already, Anna Wiener has used the archive to immerse herself in Trump TV for a piece in The New Yorker, and German Chancellor Angela Merkel, a physicist by training, is reportedly poring over archived Trump interviews to get a read on the new commander in chief.

So the Trump Archive is already serving its purpose, but for the archive’s curators, it’s only a framework for their larger vision. These archivists want artificial intelligence to play a deeper role in easing access to the statements of our elected officials in the archive, and in turn to enhance accountability.

“Here’s a really clear public interest value for artificial intelligence,” says Macdonald. “We envision this as a multi-year project to model how machine intelligence could make media more accessible and interpretable, both by humans and machines.”

Going Deeper

Currently, closed captioning text is the data thread that ties together the TV News Archive’s 1.3 million shows, gathered since 2009. A search on the Trump Archive, therefore, is a search for keywords in captions. This hack makes broadcast news videos searchable.
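To make the idea of caption-based search concrete, here is a minimal sketch in Python. It is not the archive's actual code: the CaptionSegment structure, the search_captions function and the sample captions are all invented for illustration. It assumes each broadcast has been split into timestamped caption segments and simply returns the segments that contain every word of the query.

```python
import re
from dataclasses import dataclass


@dataclass
class CaptionSegment:
    """One chunk of caption text (hypothetical structure for illustration)."""
    show: str             # made-up show label
    start_seconds: float  # offset of the caption within the broadcast
    text: str


def search_captions(segments, query):
    """Return the segments whose captions contain every word in the query."""
    wanted = re.findall(r"[a-z0-9']+", query.lower())
    hits = []
    for segment in segments:
        tokens = set(re.findall(r"[a-z0-9']+", segment.text.lower()))
        if all(word in tokens for word in wanted):
            hits.append(segment)
    return hits


# Invented sample captions, not real archive data.
SAMPLE_SEGMENTS = [
    CaptionSegment("evening-news", 12.0,
                   "The president promised an investigation into illegal voting."),
    CaptionSegment("morning-show", 95.5,
                   "Weather today is sunny with light winds."),
]

matches = search_captions(SAMPLE_SEGMENTS, "illegal voting")
print([(m.show, m.start_seconds) for m in matches])
```

Because the match is made on caption words alone, a misspelled or dropped word in the captions means the segment is never found, however clearly the phrase was spoken on air.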

But closed captioning has its limits — try counting the errors in a live broadcast — and that’s where AI factors in. Beyond text, Macdonald and the archive team want to set loose facial recognition, voice identification, and other deep learning tools to put every second of video in context.

“We want to be able to extract novel metadata around our video collections: Who is talking, when, and what type of program is it?” says Dan Schultz, senior creative technologist at the TV News Archive. “Even conducting sentiment analysis is all within that scope of collecting novel metadata.” Sentiment analysis, quite simply, uses word choice and tone to assess whether a person’s language was, for example, negative or positive.
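As a rough illustration of the word-choice half of sentiment analysis, the Python sketch below scores a sentence by counting words from two small lists. The word lists and the sentiment_score function are assumptions made up for this example; the article does not say which method the archive team would use, and real systems rely on much larger lexicons or trained models and may also consider tone of voice.

```python
import re

# Tiny illustrative word lists (assumptions, not a published lexicon).
POSITIVE_WORDS = {"great", "good", "strong", "win", "tremendous"}
NEGATIVE_WORDS = {"bad", "weak", "disaster", "carnage", "terrible"}


def sentiment_score(text):
    """Score text from -1.0 (all negative words) to 1.0 (all positive words).

    Returns 0.0 when the text contains no words from either list.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    total = positive + negative
    if total == 0:
        return 0.0
    return (positive - negative) / total


print(sentiment_score("The economy is great and strong"))
print(sentiment_score("It was a total disaster"))
```

A word-counting approach like this misses sarcasm, negation ("not good") and context, which is one reason researchers turn to machine learning for the task.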

These algorithms will be key for journalists and curious citizens alike to interrogate the data with pointed questions (How has Trump’s language regarding the economy shifted in the past 6 months?) rather than more general inquiries, and get relevant answers in return. And, in a time when partisan battles over what’s “fake news” are being waged, AI will make it even easier to cut through the clutter.

Seeing and Believing

Artificial intelligence programs already excel at extracting information from text and images. Facebook’s facial recognition software can identify you and your friends, algorithms can automatically caption photos, and researchers routinely perform sentiment analysis using Twitter data. Video is a tougher nut to crack, but the nut is indeed cracking.

Twitter’s artificial intelligence team, known as Cortex, developed an algorithm that can recognize what’s happening in a live video feed — it can tell if you’re playing a guitar or petting a cat, according to the MIT Technology Review. However, processing video, intuitively, is far more computationally heavy than text or images, and that’s what makes the task difficult.

Comcast recently acquired a company called Watchwith, which built a system that automatically generates metadata for videos using computer vision and machine learning. Google uses speech recognition to automatically generate closed captioning for videos.

Netflix and Hulu have also invested in deep learning and computer vision methods to generate video metadata to improve personal recommendations. Other companies like Clarifai, Viisights and Movida’s Deeva API rely on AI to perform similar services.

In all of these efforts, the end goal is to make videos easier to find in a digital world. Still, there’s a ways to go. “I have become fairly (skeptical) about the effectiveness of AI methods having seen so few deliver on their promise, however, it is essential to keep an open mind,” Digital Asset Management News editor Ralph Windsor wrote. For Windsor, AI still has a lot to prove before professional archivists can rely upon the technology.

Expanding the Archive

For the TV News Archive team, Trump was first in line, and in the near future they plan to expand their archival efforts to majority and minority leaders in the House of Representatives and the Senate. And, yes, they will also be archiving the digital footprint from the Obama administration.

“It is worth noting that eight years ago we didn’t have the pipelines to technology to expose this sort of thing,” Schultz said when asked why they started with Trump. “It’s sort of a perfect storm of interest, and technical timing and it aligned with the general mission of the archives.”

In addition to saving video for posterity, the archive also serves as a vehicle for creative expression. For example, the TV News Archive team incorporated a tool called Popcorn, which allows anyone to piece together video compilations of the news in their browser, without dishing out several hundred dollars for editing software.

“We’re very curious to see what will happen with it. We can’t even imagine how people will use our stuff,” says Nancy Watzman, managing editor of the Television News Archive.

 

CATEGORIZED UNDER: Technology, Top Posts
  • http://www.mazepath.com/uncleal/qz4.htm Uncle Al

    By limiting history today we can avoid denuding the Earth of trees and suffocating in contemptible MS/Ed tomorrow. “archiving the digital footprint from the Obama administration.” Wipe your shoes after stepping in it.

    A well-printed and stored black and white photograph will last centuries, even deacidified newsprint. Digital media – and the technologies to read them – last about 30 years. Can you read IBM 729 tape, DECtape? An 8-inch floppy, 5.25-inch floppy, 3.5-inch floppy? ZIP drive? 12-inch LaserDisk? BetaMax tape? Massive archives of NASA satellite climate data on tape reels are conveniently unreadable.

    Baked clay tablets are user-qualified for millennia.

    • klear101

      After reading several of your posts, I think I have a better understanding of your desire to limit the curation of history. Many of the lessons already learned would not support the ideas, stances and bigotry reflected in your incessant trolling.

      • http://www.mazepath.com/uncleal/qz4.htm Uncle Al

        Factual, def: Contradicting the Left.

  • OWilson

    So there’ll be an editing tool, called Popcorn, which will allow a user to compile cuts and edits of the archived videos of Donald Trump speeches.

    “We can’t even imagine how people will use our stuff,” says Nancy.

    Really Nancy? :)

  • John C

    And, yes, they will also be archiving the digital footprint from the Obama administration.

    ———–

    That thing is going to blow a fuse when they upload “if you like your health plan, you can keep your health plan.”

    • OWilson

      The Left believe all the unverified “golden shower” stuff because they NEED to.

      You will never find Wolf, Anderson, or Tapper loudly dismissing such nonsense as, “Totally without evidence”.

      But suggest that there is election hanky panky, and they go into their demented rants!

      It doesn’t take a rocket scientist to figure out why! :)

      From cigarettes to the homeless for votes, to Black Panthers waving baseball bats around voting booths, it is public knowledge, but since nobody on the left is ever prosecuted, and investigations never started (thanks Obama, Holder, Lynch) it follows that there are few charged with illegal behavior.

      To the Left, rioting and looting can be “just blowing off steam”, “it’s only property” so you can see why voter fraud is not high on their agenda! :)

  • vlt5

    Will it determine the difference between their statements and actions? Maybe Bill and Hillary should be added to the list!

  • Patti Nakano

    Remember the terms, Garbage In, Garbage Out? Do they still use this in computer science, particularly AI programming? Might blow a fuse deciphering the facts and alternative facts, conflicts in logic, relevant to not…. though certainly better than the human mind to try to sort this all out… How would it handle ethics?

  • gates of vienna

    Definitely need to use this on Hillary and the former President Clinton, not to mention the guy who promised both painless health insurance (count *those* utterances) and a “transparent” administration.

    AI is only as unbiased as those who input the info. Come on, be unpredictable for a change. What do you have to lose, hmm?

  • PeterTx52

    where was this concern during the last administration?

    • Lillith70

      We the people of the awakened sleeping giant have to keep on our toes and be the watchers en masse (or the whole bunch of us stay engaged), such as you appear to be.

      So the leftists are going to show the movie 1984 in theaters. Shall we show up in teeshirts that say that the proles have revolted?

  • Lillith70

    President Obama as agent of change and still active through OFA is a major addition that needs to be added for AI to have creds as “neutral”?
