Auditing Big Data


The difference between data and information is generally well understood. Data is raw and may be simple, but it is unorganized and requires processing. Once data has been processed, organized and analyzed, it becomes information that can support decision making.

Big data refers to large, complex data sets (such as website hits and social media posts) that require significant processing capability and cannot be handled using traditional methods. The data can originate from various sources, such as social media, websites and surveys.

Technical advances in storage capacity and processing power, combined with analytic tools, have enabled organizations to use big data for competitive advantage and to understand emerging business trends. With these opportunities come risks, including those relating to privacy, security, data integrity, data quality and complexity.

Auditing big data is a new challenge given the risks associated with it, and risk advisory practices across the globe are working on ways to handle it. Auditing principles such as testing the completeness and accuracy of data still hold good; however, a change in audit methodologies is the need of the hour.

The steps for approaching an audit of big data remain the same: assess the risks, map the controls, evaluate those controls for design and operating effectiveness, and customize the audit approach accordingly.

A typical audit program for a big data review would cover the following:


  • Is data managed in-house or through a service provider?
  • Is there a contract detailing the roles, obligations and responsibilities (if managed through a service provider)?
  • Is data ownership clearly defined (if managed internally)?
  • What is the strategy for data storage: own server, cloud or a data center?
  • Is responsibility for the handling of data documented?
  • Are business continuity and disaster recovery plans in place?

Input / Source data validation

  • Have the sources of data been identified?
  • How is the reliability of the data ensured?
  • Is data captured in a format that can be used and processed?
  • How are the completeness and accuracy of data ensured?
  • How is it ensured that data is safe?
  • Does the company have adequate tools for analytics?
  • Are the information processing objectives met by the data analytics tools?
  • Are the results of the tools reliable?
  • How are the results presented?
  • How is the integrity of the information received ensured?
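
One way an auditor might test completeness and accuracy is to reconcile a source extract against the data set loaded into the analytics platform. The sketch below is illustrative only: the function name `reconcile` and the field names `id` and `amount` are assumptions, not part of any particular tool, and a real engagement would adapt the keys and control totals to the data under review.

```python
from collections import Counter
from decimal import Decimal


def reconcile(source_rows, loaded_rows, key="id", amount="amount"):
    """Compare a source extract with the loaded data set.

    Both inputs are lists of dicts. The field names are hypothetical
    and would be replaced by the real record key and control field.
    """
    source_keys = [row[key] for row in source_rows]
    loaded_keys = [row[key] for row in loaded_rows]
    source_set, loaded_set = set(source_keys), set(loaded_keys)

    # Decimal avoids floating-point rounding noise in control totals.
    source_total = sum(Decimal(str(row[amount])) for row in source_rows)
    loaded_total = sum(Decimal(str(row[amount])) for row in loaded_rows)

    key_counts = Counter(loaded_keys)
    return {
        # Completeness: record counts and keys present on only one side.
        "source_count": len(source_rows),
        "loaded_count": len(loaded_rows),
        "missing_in_loaded": sorted(source_set - loaded_set),
        "unexpected_in_loaded": sorted(loaded_set - source_set),
        # Accuracy: duplicated keys and the control total difference.
        "duplicate_keys": sorted(k for k, n in key_counts.items() if n > 1),
        "control_total_difference": loaded_total - source_total,
    }
```

Matching record counts alone can hide offsetting errors, for example one dropped record and one duplicated record, which is why the sketch also compares keys and a control total.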

Storage and retrieval

  • How is continuous availability of data ensured?
  • Are policies on data storage, restoration and retrieval in place?
  • How are data backup and restoration performed?
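
A restoration test can be evidenced by comparing a cryptographic hash of the original file with that of the restored copy. The following is a minimal sketch using Python's standard `hashlib` module; the function names `sha256_of` and `restore_matches_original` are hypothetical, and it assumes the auditor has read access to both files.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Reading in chunks keeps memory use flat for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches_original(original_path, restored_path):
    """True when the restored file is byte-for-byte identical to the original."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

A matching hash shows only that the sampled file was restored intact; it does not by itself show that the backup set is complete or that restoration meets the recovery time objectives.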



The information contained herein is in summary form and is based on information available in public domain. While the information is believed to be accurate to the best of our knowledge, we do not make any representations or warranties, express or implied, as to the accuracy or completeness of such information. This document is not an offer, invitation, advice or solicitation of any kind. We accept no responsibility for any errors it may contain or for any loss, howsoever caused or sustained, by the person who relies on it.