In a follow-up interview with Mayank Kejriwal, we learned that the DIG system has been greatly improved, with more advanced algorithms and a friendlier, more streamlined user interface.
(Kejriwal is one of the three University of Southern California researchers who developed DIG, an AI tool that scours the open Web (and, to a lesser degree, the dark Web) for illicit and illegal activity, including human trafficking. For more on how even the Department of Defense and the New York prosecutors’ office are using the tool, read our recent story on DIG. The other researchers are Pedro Szekely and Craig Knoblock; the three detailed their work in the paper “Investigative Knowledge Discovery for Combating Illicit Activities.”)
Computer Society: How has your research progressed since you completed your article?
Kejriwal: The DIG system has undergone significant extensions since we completed that article. First, DIG now incorporates more advanced AI algorithms for extracting knowledge from webpages and making that knowledge available through its graphical search engine. Feedback from investigative users has also been used to make the user interface friendlier and more streamlined. Since we published the article, DARPA has evaluated DIG in controlled settings (with real domain experts) on at least four other investigative domains: securities fraud, narcotics, mail shipment fraud, counterfeit electronics and illegal weapons sales. Results have been very promising, leading to discussions between agencies like the Securities and Exchange Commission and the teams building this technology, including our own.
On a more social front, we have been collaborating with a team of social scientists to understand the dynamics of online sex advertisement. For example, what is the social network of online sex advertisers, and how do online sex rings grow? How much are they concentrated in big cities? How much ethnic homogeneity do we observe within groups? By leveraging the analytics that we developed under DIG funding, we are now in a position to scientifically investigate these questions.
Recently, for example, we used computational models to build a national network of online sex workers. This network has a very interesting structure, and is being constructed over 25 million crawled sex ads that were made available to us by our collaborators. It is, to our knowledge, the first and only network of its kind to exist for the sex advertisement domain. This endeavor continues to be an exciting one for us because it is a real use-case of “AI for Social Good,” and a case study of interdisciplinary research that could have high impact on both computer science and social science. We hope to publish several sets of results arising from this collaboration in the months to come.
Computer Society: What was the feedback from law enforcement agencies and DARPA, in general, and are you still collaborating with them?
Kejriwal: Law enforcement’s feedback on all the MEMEX tools has been very positive, particularly from the office of the District Attorney of New York. Collaborations have been ongoing, with several MEMEX tools transitioned to law enforcement starting late last year. Important components of DIG were included in that transition and are still in use. In terms of research, DIG has continued to be funded under other DARPA and IARPA programs, and is now being extended for a wide range of use cases, most notably causal exploration (DARPA CauseEx) and geopolitical forecasting (IARPA HFC). We hope to continue using DIG to offer valuable, AI-driven but also human-focused insights to analysts who continue to be inundated with large quantities of messy, raw data. DIG continues to be openly available under a permissive MIT license. Within ISI, we maintain it actively and resolve issues from our users as they come to our attention via GitHub.
Computer Society: How would you describe DIG in layman’s terms? Is it a software tool? Search engine? Builder of graphics and charts?
Kejriwal: DIG is actually all of the above: a software tool, a search engine, and a builder of graphics and charts. In layman’s terms, DIG is an AI tool for building a personalized search engine that finds and analyzes only those parts of the Web that you care about. For example, if you are a human trafficking investigator, DIG would find and analyze only sex ads, so that you don’t have to spend months doing customized data collection and processing. DIG uses AI to extract important pieces of information from these pages and to help you answer questions about your domain of interest.
Computer Society: What does the DIG interface look like?
Kejriwal: Here is a screenshot (below), with identifying information blocked out. The underlying pane shows what appears when we search for something on the DIG main page, while the overlying pane shows graphs, structure, and so on.