Technology Mash-up: Integrating Computer Forensics and Data Analytics

GUEST BLOGGER

Jeremy Clopton, CFE, CPA, ACDA
Senior Managing Consultant, Forensics and Valuation Services, BKD, LLP

A recent Journal of Accountancy article focused on the integration of unstructured data into the risk management process. The authors discussed various ways to use unstructured data proactively for risk management and identification. They also included an eight-step approach to accomplish this goal. Two of the topics from that article, used together, can create an efficient and effective investigation tool. 

In many investigations, computer forensics experts image and obtain data (much of it unstructured) from computers used by individuals who are part of the investigation. Investigations also rely on data analytics experts to analyze transactional data. Using these experts together can increase both the effectiveness and the efficiency of investigating that unstructured data. As I mentioned in a previous post, at its core unstructured data is still data and, as such, is analyzed in much the same way as transactional data.

Text mining is the most common catchall term for analytics applied to unstructured data, and it is much more than just keyword searching and sorting. Text mining is a family of tools and procedures that, applied collectively, form an impressive investigative toolset and methodology. Some of the functions in the text mining family include:

  • Traditional searching: keyword searches, indexing and traditional computer forensics
  • Topic mapping: automated extraction and analysis of key topics, themes and concepts over time
  • Part-of-speech tagging: analysis of the grammatical structure of communications to help identify tone, entities, individuals and concepts
  • Tone detection: analyzing the sentiment or emotional tone of communications
  • Named entity extraction: identification of key entities and individuals within documents and communications
  • Predictive coding/natural language processing: artificial intelligence-assisted analysis used to identify similar documents and content for more effective review
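To make one of these functions concrete, here is a minimal sketch of lexicon-based tone detection. The word lists, email text and scoring approach are illustrative assumptions for this post, not a production sentiment model; real engagements would use far richer lexicons or trained classifiers.

```python
# Crude tone detection: count how many words in a message come from
# small "secretive" and "urgent" lexicons. (Lexicons are invented
# examples, not an authoritative word list.)

SECRETIVE = {"quiet", "delete", "offline", "between", "careful", "nobody"}
URGENT = {"now", "asap", "immediately", "urgent"}

def tone_score(text):
    """Return counts of secretive/urgent lexicon hits as a tone signal."""
    words = {w.strip(".,!?-").lower() for w in text.split()}
    return {
        "secretive": len(words & SECRETIVE),
        "urgent": len(words & URGENT),
    }

email = "Keep this between us. Delete after reading, and be careful."
print(tone_score(email))
```

An analyst could run a score like this over an entire mailbox and rank messages for review, rather than reading every email.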

Leveraging the information gathered during the analysis of unstructured data enhances the analysis of more traditional structured data. For example, take two employees who, through email analysis, are found to consistently discuss a vendor in vague or conspiratorial “tones.” The analyst extracts the vendor name, to/from, date/times and overall emotional tone from that email chain. She then integrates that data into her analysis of purchasing activity, specifically focusing on unusual trends or patterns for that vendor on or around those dates. This process utilizes both the computer forensics and data analytics experts for a more comprehensive analysis. 
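The workflow in that example can be sketched in a few lines: take emails flagged by tone analysis and pull purchasing transactions for the same vendor within a window around the email dates. All of the data, field names and the seven-day window below are invented for illustration.

```python
# Sketch: join email-derived flags with transactional purchasing data.
from datetime import date, timedelta

# Output of the unstructured-data analysis (vendor, date, tone).
flagged_emails = [
    {"vendor": "Acme Supply", "sent": date(2024, 3, 4), "tone": "conspiratorial"},
]

# Structured purchasing data.
purchases = [
    {"vendor": "Acme Supply", "date": date(2024, 3, 5), "amount": 9_800},
    {"vendor": "Acme Supply", "date": date(2024, 6, 1), "amount": 1_200},
    {"vendor": "Widget Co",   "date": date(2024, 3, 5), "amount": 500},
]

def transactions_near_emails(emails, txns, window_days=7):
    """Return transactions for a flagged vendor dated within
    +/- window_days of a flagged email."""
    window = timedelta(days=window_days)
    return [
        t for t in txns
        for e in emails
        if t["vendor"] == e["vendor"] and abs(t["date"] - e["sent"]) <= window
    ]

hits = transactions_near_emails(flagged_emails, purchases)
# Only the Acme Supply purchase dated near the flagged email is returned.
```

The point is not the code itself but the handoff: the computer forensics side produces the flagged emails, and the data analytics side uses them to focus the transactional testing.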

While the Journal of Accountancy article discusses more preventative measures, the principle holds true for investigative ones. Leveraging the unstructured data within an organization can have a profound impact on risk management – from both the preventative and investigative standpoints.