The volume of data we store is growing exponentially. Research by Verizon suggests that the amount of data stored will double in the next 18 months, with the same information duplicated up to 19 times within an organisation. It's not hard to see how this happens. Imagine this scenario:
Ted in sales asks Vera in IT to extract some customer information for a marketing proposal. Vera pulls off 1,000 records from the CRM system into a CSV file and emails them to Ted.
So far so simple? A copy on the system and an email copy, right? Wrong.
We've got five copies of the data already: one on the system (1), one .csv file on Vera's PC (2), one in Vera's sent items (3), one in Ted's inbox (4) and one on Ted's PC (5). Then we need to double that to allow for backups. One extract and one email mean ten copies of the data to store and manage. If Ted then imports it into Excel and sends a copy to Tim in marketing, who passes it to a third-party mailing house, we're up to those 19 copies in no time at all - and large organisations will have processes like this taking place hundreds or thousands of times a day.
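The tally above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `total_copies` and the `backup_factor` parameter are made up for this example, and it assumes every location is backed up exactly once, which is the same simplification as "double it" in the scenario.

```python
# Each entry is one place the 1,000-record extract ends up,
# in the order described in the scenario.
primary_copies = [
    "CRM system",
    "CSV file on Vera's PC",
    "Vera's sent items",
    "Ted's inbox",
    "Ted's PC",
]


def total_copies(locations, backup_factor=2):
    """Return the number of copies to store and manage.

    Assumption: every location is backed up once, so each primary
    copy counts twice (backup_factor=2).
    """
    return len(locations) * backup_factor


print(total_copies(primary_copies))  # 10
```

Adding further locations to the list (Ted's spreadsheet, Tim's inbox, the mailing house) shows how quickly the total climbs.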
In the end, had the process been designed properly, all that was needed to achieve the desired outcome was the original copy and its backup.
The cost of managing this vast stockpile of information is horrific, and made worse by the fact that most of it is effectively worthless. We might as well be burning banknotes. However, few companies could claim to be good enough to need no more than two copies of anything. Improvement often requires considerable investment, duplicated across every corporate process.
In the meantime, though, some essential controls would help. For example:
- Shared folders reduce the need for emailing data, but they are not always available and staff do not always use them;
- Internal email is often not monitored, so internal data sharing won't be identified;
- Internal FTP arrangements or dedicated transfer areas could be established to transfer between departments without shared folders;
- A data retention policy could be established to ensure data is not kept longer than necessary;
- Data storage costs and savings can be built into project proposals; and
- Clarity as to the company's information strategy would help ensure that data was only held and shared if it was likely to be needed and in the company's long-term interest. Cutting the number of data fields in a major database or removing obsolete customer records would make a big difference if you account for 19 copies, not just one.
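To see why trimming the master data matters so much, here is a rough sketch of the multiplier effect. The figures are purely hypothetical (100,000 obsolete records at 2,000 bytes each are invented for illustration, not taken from any real system), and `storage_saved_bytes` is a made-up helper name. The only number drawn from the article is the 19 copies.

```python
def storage_saved_bytes(records_removed, bytes_per_record, copies=19):
    """Estimate storage freed by removing records from the master system.

    Assumption: every record removed at source also disappears from
    each downstream copy, so the saving is multiplied by `copies`.
    """
    return records_removed * bytes_per_record * copies


# Hypothetical figures, for illustration only.
single_copy = storage_saved_bytes(100_000, 2_000, copies=1)
all_copies = storage_saved_bytes(100_000, 2_000)

print(single_copy)  # 200000000 bytes (0.2 GB) if only one copy existed
print(all_copies)   # 3800000000 bytes (3.8 GB) across 19 copies
```

The same multiplier applies to every other per-copy cost, such as backup and restoration, not only to raw storage.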
Ultimately though, we all need to get a lot cleverer about how we share information. Every bit and every byte has a cost - storage, backup, BCP, security, integrity, restoration, forensic analysis... so why do we continue to incur these costs without challenge?
Do you even know how many customer records you hold? I'll admit it: I don't.