No one in a technical or strategic business role is likely to underestimate the importance of big data, but many organizations overlook the value of data preparation in the data analysis process. Preparing data involves cleaning, organizing and presenting raw data so that departments on the operations side can use it to make informed decisions more effectively and in less time.
One of the first steps in data preparation is filtering raw information to clarify what is and is not needed to answer the questions at hand. Communication between technical and operations staff is essential to this step. Business analysts and subject matter experts must supply IT with the information required to develop appropriate filters. Setting the net too wide creates cumbersome reports that make analysis difficult, while getting too specific with queries may leave out data relevant to the discussion. Filters should narrow the data enough to keep reports manageable without excluding anything relevant. For example, a marketing department analyzing the results of its last campaign might begin data preparation with a filter on the phone numbers or backlinks tied to that campaign.
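As a rough illustration of the marketing example, the filter could look something like the Python sketch below. The records, field names (lead_id, phone_called, referrer) and campaign markers are all hypothetical and stand in for whatever the business analysts would actually supply.

```python
# Hypothetical raw lead records; field names and values are illustrative only.
raw_leads = [
    {"lead_id": 1, "phone_called": "555-0100", "referrer": "example.com/spring-sale"},
    {"lead_id": 2, "phone_called": "555-0199", "referrer": "example.com/about"},
    {"lead_id": 3, "phone_called": "555-0101", "referrer": ""},
    {"lead_id": 4, "phone_called": "555-0199", "referrer": "partner.example/spring-sale"},
]

# Assumed campaign markers, as supplied by the marketing team.
CAMPAIGN_PHONES = {"555-0100", "555-0101"}
CAMPAIGN_LINK_TOKEN = "spring-sale"


def is_campaign_lead(record):
    """Keep a record if it matches a campaign phone number or a campaign backlink."""
    return (
        record["phone_called"] in CAMPAIGN_PHONES
        or CAMPAIGN_LINK_TOKEN in record["referrer"]
    )


campaign_leads = [r for r in raw_leads if is_campaign_lead(r)]
print([r["lead_id"] for r in campaign_leads])  # prints [1, 3, 4]
```

Matching on either marker keeps the net reasonably wide; requiring both would be the overly specific case that risks dropping relevant records.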
The next step in preparing data for analysis is eliminating bad or unnecessary information, a process commonly referred to as scrubbing. Scrubbing involves removing duplicate entries, blank rows or fields, or unnecessary data points pulled by a query. An accounting department looking for data on spending doesn't necessarily need to see the product SKUs for each order, and may be satisfied with just the item, department, unit price and total expense fields. Additionally, not all data should be shared with all employees, which means that sensitive information, such as Social Security numbers, must be removed before data is made available.
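The scrubbing step for the accounting example might be sketched as follows. This is a minimal illustration with made-up rows and assumed field names (sku, ssn and so on), not a complete cleaning routine: it drops blank rows and exact duplicates, then keeps only the fields the department asked for, which also removes the sensitive column.

```python
# Fields the accounting department asked for (assumed names).
KEEP_FIELDS = ("item", "department", "unit_price", "total_expense")

# Hypothetical query output, including a duplicate, a blank row and sensitive data.
raw_orders = [
    {"item": "Toner", "department": "Sales", "unit_price": "40.00",
     "total_expense": "80.00", "sku": "T-100", "ssn": "000-00-0000"},
    {"item": "Toner", "department": "Sales", "unit_price": "40.00",
     "total_expense": "80.00", "sku": "T-100", "ssn": "000-00-0000"},
    {"item": "", "department": "", "unit_price": "",
     "total_expense": "", "sku": "", "ssn": ""},
    {"item": "Chairs", "department": "Support", "unit_price": "125.50",
     "total_expense": "1255.00", "sku": "C-220", "ssn": "000-00-0001"},
]


def scrub(rows, keep_fields):
    """Drop blank rows and exact duplicates, then keep only the requested fields."""
    seen = set()
    cleaned = []
    for row in rows:
        values = {key: (value or "").strip() for key, value in row.items()}
        if not any(values.values()):
            continue  # every field is empty: a blank row
        fingerprint = tuple(sorted(values.items()))
        if fingerprint in seen:
            continue  # exact duplicate of a row already kept
        seen.add(fingerprint)
        # Projecting onto keep_fields discards SKUs and Social Security numbers.
        cleaned.append({field: values.get(field, "") for field in keep_fields})
    return cleaned


clean_orders = scrub(raw_orders, KEEP_FIELDS)
print(len(clean_orders))  # prints 2
```

Duplicates are detected on the full row before any columns are dropped, so two genuinely different orders that happen to share an item and price are not collapsed by mistake.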
The final step in data preparation is formatting information in a way that is conducive to data analysis. The format depends, of course, on who is performing the analysis and what programs are available to them. In some companies, analysts in each department have access to the same report-writing and database software that technical employees do. In others, a database administrator may need to pull and scrub the data before packaging it in a Microsoft database or Excel file. Administrators who take the extra time to add a bit of formatting, rather than leaving data in a flat format such as a CSV file, are likely to become favorites of analysts and managers in other departments.
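To give a sense of what "a bit of formatting" can mean, the sketch below turns scrubbed rows into an aligned plain-text report with thousands separators and a total line. It is only a stand-in: a real deliverable would more likely be an Excel or database file, and the rows and column names here are assumptions carried over for illustration.

```python
from decimal import Decimal

# Hypothetical scrubbed rows; amounts arrive as text, as they would from a CSV.
report_rows = [
    {"item": "Toner", "department": "Sales",
     "unit_price": "40.00", "total_expense": "80.00"},
    {"item": "Chairs", "department": "Support",
     "unit_price": "125.50", "total_expense": "1255.00"},
]


def format_report(rows):
    """Return an aligned text table with formatted amounts and a grand total."""
    header = f"{'Item':<10}{'Department':<12}{'Unit price':>12}{'Total':>12}"
    lines = [header, "-" * len(header)]
    for row in rows:
        unit = Decimal(row["unit_price"])
        total = Decimal(row["total_expense"])
        lines.append(
            f"{row['item']:<10}{row['department']:<12}{unit:>12,.2f}{total:>12,.2f}"
        )
    grand_total = sum(Decimal(row["total_expense"]) for row in rows)
    lines.append(f"{'Total':<22}{grand_total:>24,.2f}")
    return "\n".join(lines)


report = format_report(report_rows)
print(report)
```

Decimal is used instead of float so that currency amounts are summed exactly, which matters once the figures end up in front of an accounting team.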
One of the challenges with data preparation is that most individuals on the operations side take it for granted. They request reports without realizing how much work goes into pulling and preparing the information so that it is useful. Instead of passing on unusable data and frustrating operations, technical managers should make a point of explaining the process to other business leaders so that they understand the time and effort involved in meeting each request.
Data preparation is the unsung hero of the entire data analysis process. By working closely together, business and technical departments can create more efficient analysis processes, which in turn support better and faster decision-making across the entire organization.
Photo courtesy of Stuart Miles at FreeDigitalPhotos.net