Data Leak: What It Is & How To Prevent It

Almost every other week, there are stories about some high-profile organization experiencing a data leak. In the U.S. alone, 1,802 publicly reported incidents last year resulted in 4.2 billion private records leaking online. This is only the tip of the iceberg, since plenty of organizations that suffer leaks do not report them to authorities or make headlines.

A data leak occurs when sensitive or confidential data is intentionally or unintentionally disclosed to an unauthorized third party. It usually involves the exposure of sensitive files and data such as customer data, contact information, healthcare data, financial information, social security numbers, credit card information, etc.

Leaks can be far more than a temporary terror — they may change the course of your life. Businesses, governments, and individuals alike can experience huge complications from having sensitive information exposed. Whether you are offline or online, hackers can get to you through the internet, Bluetooth, text messages, or the online services that you use.

Without proper attention to detail, a small vulnerability can cause a massive data breach. And since many people are unaware of how common modern security threats work, they don’t give these threats enough attention.

Understanding the concept of data leaks

A data leak is when sensitive data is accidentally exposed, whether physically (for example, through lost hard drives or laptops), on the Internet, or in any other form. This allows cybercriminals to gain unauthorized access to sensitive data with little effort. When sensitive data is posted on the dark web following a cyberattack, the event is also classified as a data leak because it helps expedite future data breaches.

The terms data breach and data leak are often used interchangeably, but that’s incorrect as they’re two separate categories of data compromise.

A data breach is when sensitive data is accessed and compromised in a successful attack.
A data leak is the exposure of sensitive data that could be used to make future data breaches happen faster. For example, stolen data posted on ransomware blogs is classified as a data leak because it could be used to compromise IT networks with less effort. Poor data security practices, such as software misconfigurations, also cause data leaks.

If a cybercriminal identifies a data leak, the exposed data could be used to plan a successful cyberattack. Detecting and remediating data leaks before cybercriminals discover them therefore significantly reduces the risk of data breaches.

How do data leaks happen?

Weak infrastructure. An improperly configured network infrastructure can allow data to be leaked, causing loss or even misuse. For example, cybersecurity company Cognyte left a massive database unsecured, with no authentication or authorization required for access. As a result, more than 5 billion records were exposed online.
System error. System errors can leave networks vulnerable. In 2019, a Facebook vulnerability that has since been fixed allowed scammers to scrape the personal data of over 530 million Facebook users across 106 countries, including their email addresses, phone numbers, locations, and other details. In 2021, the data was posted on a hacking forum.
Human error. Recent statistics reveal that human error is the primary cause of data leaks and breaches. Human error can cause leaks of various degrees, from an email sent to the wrong people to massive leaks caused by stolen credentials.
Third-party vulnerabilities. Third-party applications and vendors may need access to your system or network, but they can pose a risk.
Malicious insiders. Leaks caused intentionally by malicious insiders are not as common as accidental leaks. In 2021, four lawyers at the Elliott Greenleaf law firm allegedly stole and deleted company files to help a competing law firm open a new office.

According to a recent report by the Identity Theft Resource Center (ITRC), data compromises rose by almost 70% in 2021 compared with the previous year, putting the total almost 25% above the previous all-time high set in 2017.

The average cost of data breaches was nearly $4.5 million in 2021, a figure that includes associated consequences such as regulatory fines, lawsuits, and loss of customer trust. It is no wonder that more organizations are now implementing data protection measures to prevent data leakage.

Types of data leaks

Shadow IT

Employees contending with heavy workloads and very stringent deadlines may use workarounds and unapproved third-party applications and solutions to get things done. The resulting infrastructure is called “shadow IT.” Unsanctioned third-party applications and technology that employees are likely to use include:

Cloud technology and storage
Software-as-a-Service (SaaS) applications
Web applications

Although employees using their own systems and devices can help with productivity, the risk is that shadow IT can lead to unauthorized access to data in the cloud, which can result in information leakage, changes to the data by unapproved users, and data corruption.

Additionally, shadow IT creates blind spots for IT teams who may not become aware of the data leak until it is too late.

Phishing

Phishing continues to be a popular way to attack businesses—because it works. Its tactics can expose and allow exploitation of sensitive company data if an employee:

Clicks on a malicious link in an email
Shares credentials with others
Falls for social engineering scams

The consequences can range from unauthorized data access to the installation of malware and other malicious files.

Legacy tools

Despite technological advances, numerous organizations and their employees are still using certain legacy tools, such as external USB drives, desktop email applications, and public printers. While there is nothing inherently wrong with these tools, they can cause a leak.

Imagine an employee losing a USB drive containing sensitive data in a public place. Or imagine private company documents being printed at home or a public printing center.

Privileged or business users

In 2018, Twitter urged its 330 million users to change their passwords after a bug exposed them. The bug was in the hashing process, which Twitter uses to mask its users’ passwords before storing them. The social networking site said it found and fixed the bug, but the incident is a good example of how an internal flaw can expose sensitive data.
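To make the idea of password hashing concrete, here is a minimal sketch using Python's standard library. It is not Twitter's implementation; the function names, the choice of PBKDF2 and the iteration count are assumptions made for illustration only.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Assumed work factor for illustration; tune it for your own hardware and policy.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a password using salted PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and digest would be stored. The plain-text password should never be written to disk or to logs at any point, including before hashing completes.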

Twitter also suffered a potential breach in May 2020, which could have affected businesses using its advertising and analytics platforms. An issue with its cache saw Twitter admit it was “possible” that some users’ email addresses, phone numbers, and the final four digits of their credit card numbers could have been accessed.

What do cybercriminals look for in data leaks?

The main thing that cybercriminals look for is personally identifiable information (PII). PII includes social security numbers, credit card numbers and any other personal details that could result in identity theft. Note that not all PII is what you would traditionally think of as confidential information. Simple data like a name or a mother’s maiden name are targets too.

Another common target is medical or protected health information (PHI) as defined in the US HIPAA standard, “information that is created by a health care provider [and] relates to the past, present, or future physical or mental health or condition of any individual.”

Customer information

This data differs from company to company, but there are usually some common factors involved:

Identity information: name, address, phone number, email address, username, password
Activity information: order and payment history, browsing habits, usage details
Credit card information: card numbers, CVV codes, expiration dates, billing zip codes

Information that is specific to the company can also be exposed. This can be financials for banks and investment groups, medical records for hospitals and insurers or sensitive documents and forms for government entities.

Analytics

Analytics rely on large data sets containing multiple information sources that reveal big-picture trends, patterns and trajectories. As important as analytics are for many businesses, the data needed to perform the analytics can be a vector of attack if not properly secured. Analytics data includes:

Psychographic data: Preferences, personality attributes, demographics, messaging
Behavioral data: Detailed information about how someone uses a website, for example
Modeled data: Predicted attributes based on other information gathered

Analytics gives you a way to understand individuals as a set of data points and predict their next actions with a high degree of accuracy. This may sound abstract, but this type of data can be used to sway voters and change the tide of elections by persuading at scale. Its exposure can also cause reputational damage.

Trade secrets

This is the most dangerous thing to be exposed in a data leak: information that is critical to your business and its ability to compete. Trade secrets include:

Plans, formulas, designs: Information about existing or upcoming products and services
Code and software: Proprietary technology the business sells or built for in-house use
Commercial methods: Market strategies and contacts

Exposure of this type of data can devalue the products and services your business provides and undo years of research.

How to prevent data leaks

Mitigation strategies are abundant, but processes can grow in complexity, so it’s wise to partner with cybersecurity services. Below is a partial list of best practices that reduce the risk of data leaks.

Train employees well

Most breaches are a result of human error. Organizations must educate employees on the perils of data leaks and on best practices for storing, protecting, transmitting and sharing sensitive data. Regular security awareness training helps employees be more alert, responsible and accountable for data security.

It also helps develop security behaviors such as higher sensitivity to phishing attempts, safe browsing and better social media etiquette. All of these help to lower the risk of accidental data leaks.

Tighten employee access and privileges

Avoid giving employees blanket access to all data. Limit use of administrator privileges and enable access to only those employees who require it. Restrict data downloads. Create a zero-trust environment so that only authenticated and authorized users have access to critical systems.

Also mandate the use of multifactor authentication to reduce the risk of identity theft.
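The access rules above can be sketched as a deny-by-default check. This is a simplified illustration, not a full zero-trust design; the `User` class, the `RESOURCE_ROLES` mapping and the resource names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class User:
    name: str
    roles: Set[str] = field(default_factory=set)
    mfa_verified: bool = False  # set after a successful multifactor login


# Hypothetical mapping of resources to the roles allowed to access them.
RESOURCE_ROLES: Dict[str, Set[str]] = {
    "customer_records": {"support", "compliance"},
    "payroll": {"finance"},
}


def can_access(user: User, resource: str) -> bool:
    """Deny by default: allow only MFA-verified users who hold a permitted role."""
    allowed_roles = RESOURCE_ROLES.get(resource, set())  # unknown resource: nobody
    return user.mfa_verified and bool(user.roles & allowed_roles)
```

Because unknown resources map to an empty set of roles, anything not explicitly granted is refused, which is the behavior least privilege calls for.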

Monitor data closely

Review and classify your data regularly. Focus on sensitive data and use data leakage prevention tools to monitor and control the movement of data. Deploy encryption so that sensitive data is secured both at rest and in transit. Use data discovery tools to carry out content analysis, tracking the movement of sensitive content across the network.
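As a rough idea of what content analysis involves, the sketch below scans text for strings that look like U.S. social security numbers or payment card numbers. The patterns and function names are assumptions for illustration; commercial data discovery tools use far richer rules and context.

```python
import re
from typing import List, Tuple

# Simplified, hypothetical patterns; expect false positives and misses.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    checksum = 0
    # Double every second digit, counting from the rightmost digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return len(digits) >= 13 and checksum % 10 == 0


def find_sensitive_data(text: str) -> List[Tuple[str, str]]:
    """Return (kind, match) pairs for likely SSNs and card numbers in text."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    for m in CARD_PATTERN.finditer(text):
        candidate = m.group().strip(" -")
        # The Luhn check filters out most digit runs that are not card numbers.
        if luhn_valid(candidate):
            findings.append(("card", candidate))
    return findings
```

A scanner like this could run over outgoing email or files in shared folders, with matches routed to a reviewer rather than blocked outright while the rules are being tuned.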

Use mobile device management tools with the ability to remotely wipe devices that have been lost or stolen.

Clamp down on third-party risks

Uber, Samsung, Toyota and others have suffered breaches due to vulnerabilities in third-party suppliers. Be sure to conduct thorough due diligence on critical suppliers and ensure that they deploy best-in-class security standards and processes. Third-party risks also originate from APIs, applications and software.

Remember to maintain a software bill of materials (SBOM) so that you can track and monitor the security risks of the various components you depend on.
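One way to picture how an SBOM gets used is to compare each listed component against known advisories. The sketch below uses made-up component names and advisory data, and a plain list of dictionaries rather than a real SBOM format.

```python
from typing import Dict, List, Set

# Hypothetical SBOM entries: each component has a name and a pinned version.
SBOM = [
    {"name": "libexample", "version": "1.2.0"},
    {"name": "webframework", "version": "4.1.3"},
]

# Hypothetical advisory data: component name -> versions known to be vulnerable.
ADVISORIES: Dict[str, Set[str]] = {"libexample": {"1.1.0", "1.2.0"}}


def vulnerable_components(sbom: List[dict], advisories: Dict[str, Set[str]]) -> List[str]:
    """Return 'name@version' for each SBOM component with a matching advisory."""
    return [
        f"{c['name']}@{c['version']}"
        for c in sbom
        if c["version"] in advisories.get(c["name"], set())
    ]
```

In practice the advisory data would come from a vulnerability feed and the check would run whenever the SBOM or the feed changes.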

Plug loopholes proactively

Scan your IT environment regularly for bugs and vulnerabilities. Patch systems regularly to plug loopholes and vulnerabilities. Configure firewalls, clouds and other security systems so that attackers cannot take advantage of misconfigurations and open ports.
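A configuration review of this kind can be partly automated. The sketch below audits a hypothetical service inventory for two of the problems mentioned in this article: services reachable from the internet without authentication, and unexpected open ports. The `Service` fields, the allow list and the inventory are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Service:
    name: str
    port: int
    public: bool         # reachable from the internet
    requires_auth: bool  # authentication is enforced


# Hypothetical allow list of ports that may be exposed publicly.
ALLOWED_PUBLIC_PORTS = {443}

# Hypothetical inventory, for example exported from a cloud console.
INVENTORY = [
    Service("web", 443, public=True, requires_auth=True),
    Service("search-db", 9200, public=True, requires_auth=False),
    Service("internal-api", 8080, public=False, requires_auth=True),
]


def audit(services: List[Service]) -> List[str]:
    """Return a human-readable finding for each risky configuration."""
    findings = []
    for service in services:
        if service.public and not service.requires_auth:
            findings.append(f"{service.name}: publicly reachable without authentication")
        if service.public and service.port not in ALLOWED_PUBLIC_PORTS:
            findings.append(f"{service.name}: port {service.port} is open to the internet")
    return findings
```

Calling `audit(INVENTORY)` flags the `search-db` entry on both counts, which is the kind of unsecured database that leads to leaks.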

If needed, leverage a third-party provider to test your security defenses by carrying out quarterly penetration tests.
