LexisNexis' Flavio Villanustre: Insurance and the Big Data Bonanza
"Too often a company needs to do processing on the data it stores in the cloud," said Flavio Villanustre, head of LexisNexis HPCC Systems and vice president of information security at LexisNexis Risk Solutions. "When you do processing in the cloud, that requires you to have an encryption key available somewhere. So anyone with access to your data processing in the cloud has access to the encryption key."
Insurance companies have always been adamant about spotting and controlling risks. That, after all, is the basis for accepting policyholders and placing them into rate categories.
Before the Big Data explosion, insurance companies crunched numbers like everybody else, relying on limited information gathering and spreadsheet analysis. Today, however, the insurance industry is fast becoming one of the biggest consumers of Big Data services.
LexisNexis, like many of the companies that provide Big Data services, relies on an open source platform to manage the process. It uses a High Performance Computing Cluster, a massively parallel processing platform, to solve Big Data problems.
"The insurance companies do not want to charge a premium that is too low so they lose money," explained Flavio Villanustre, vice president of infrastructure and products for LexisNexis HPCC Systems and vice president of information security of LexisNexis Risk Solutions.
"They quote premium prices based on recent indicators of risk," Villanustre added. "They also factor in previous experience about claims. If there was a loss, how much did it cost them, etc. Before Big Data they did this as best as they could given the limited data that they had."
In this exclusive interview, LinuxInsider talks to Villanustre about how the insurance industry is using Big Data and open source solutions to spot and prosecute fraud cases.
LinuxInsider: What do you see as the single most important contribution Big Data brings to the insurance industry?
Flavio Villanustre: Big Data is a lot more precise. In today's extremely competitive world, insurance companies are squeezing their profit margins. Customers will choose the cheapest premium offer that gives them their desired level of service. To compete, insurance companies use better predictors of risk than they could without Big Data.
LI: Does the insurance industry target deliberate misinformation in applications or fraudulent claim payouts through Big Data?
Villanustre: Insurance companies target both of those things and more through Big Data. An applicant might not report a previous accident when applying to another insurance carrier in an effort to get a better quote. Now, insurance companies are able to share information about claims history for any applicant. This lets them avoid losing money on that person's future claims by charging a higher premium from the start.
At the back end of this process is the ability to spot claims patterns. Insurance companies are able to detect when something a claimant reported as a one-off situation is actually part of a pattern. Even more essential is the ability to use Big Data to detect organized crime activities in insurance claims. A person might file claims in a way that tries to fly under the radar. Each of a person's claims may be small enough for the insurance company not to be suspicious, but when you total the claims across a series of insurance companies, you might get a much bigger picture, one that may be much worse than you thought. Unfortunately, in many of those cases you might be seeing different people making the claims and not realize that they are connected.
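As a rough illustration of the pattern described above, the following minimal Python sketch totals small claims across carriers and flags groups of claimants who share a contact detail. It is not LexisNexis' method: the records, the use of a phone number as the linking attribute, the function name flag_linked_groups, and both dollar thresholds are hypothetical, chosen only to show how individually unremarkable claims can add up.

```python
from collections import defaultdict

# Hypothetical pooled claim records: (claimant, carrier, amount_usd, phone)
CLAIMS = [
    ("A. Smith", "Carrier 1", 900, "555-0101"),
    ("A. Smith", "Carrier 2", 850, "555-0101"),
    ("B. Jones", "Carrier 3", 950, "555-0101"),  # different name, same phone
    ("C. Lee", "Carrier 1", 700, "555-0199"),
]

PER_CLAIM_LIMIT = 1000  # assumed: a single claim below this draws no review
GROUP_LIMIT = 2000      # assumed: a linked group's total above this is flagged


def flag_linked_groups(claims, per_claim_limit=PER_CLAIM_LIMIT,
                       group_limit=GROUP_LIMIT):
    """Return {phone: total} for linked groups whose small claims exceed group_limit."""
    totals = defaultdict(int)
    for _claimant, _carrier, amount, phone in claims:
        # Count only claims small enough to pass unnoticed on their own.
        if amount < per_claim_limit:
            totals[phone] += amount
    return {phone: total for phone, total in totals.items() if total > group_limit}


flagged = flag_linked_groups(CLAIMS)
print(flagged)
```

In this made-up data, no single claim reaches the per-claim limit, yet the three claims tied to one phone number, filed under two names at three carriers, together exceed the group limit. A production system would link people on many more attributes than a single phone number.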
LI: How much concern should consumers have for security of their data stored in massive Big Data computers?
Villanustre: When it comes to storing the data, the security issues are not much different from any company handling data, but there are some new challenges presented with Big Data. These are not on the customers' side as much as on the Big Data processing service's side. For instance, we have affiliations with a variety of places that provide the data. To make this collaboration work, we have to allow access to our data from a variety of different groups. In a more traditional data process you would restrict that to perhaps one group only. There are a number of consumer protection laws that must be followed as well, and you need to apply a common sense approach and try to not be creepy.
LI: What are the concerns regarding the use of Big Data and cloud storage?
Villanustre: Security is one. When you rely on a service to acquire and store data, security is always an issue. If you use a public service like Dropbox for company or personal cloud storage, there are a couple of ways you can use it. You can put your data there, and it will most likely be OK. But you should not think about storing your sensitive data there unless you encrypt the data yourself and then send it. Then you download it and decrypt it yourself. The storage facility has no idea of your contents.
Too often a company needs to do processing on the data it stores in the cloud. When you do processing in the cloud, that requires you to have an encryption key available somewhere. So anyone with access to your data processing in the cloud has access to the encryption key. That makes your data insecure.
LI: What is the best approach for managing that security concern?
Villanustre: You need to rely on your contract with the cloud provider to ensure that they employ security that is at least as good as yours or better. Otherwise, there is too much risk.
LI: Are there misconceptions that consumers have about Big Data and cloud storage?
Villanustre: You know how advertising works. The companies have nice, shiny documents that show all of the benefits. What they do not show you are the downsides. For instance, if I go to a cloud provider about hosting access to the site, I must ask if they are liable for any breaches. Of course, they are not going to volunteer that they are not.
Another item that is often underestimated is the cost and the logistics of uploading data. You also have to take into account where the data is located. Sometimes it makes more sense to be closer to where the data is located.
LI: Is that concern over the distance of transmission related to increased risk of losing data?
Villanustre: Data is always susceptible to a threat of intrusion or theft. So, how you get it there matters. Even with sending encrypted data, everything can be decrypted given enough time and computing power. Time is a factor. If you use high-level data encryption, sensitive data can lose its value in the time it takes to break the encryption. It is no different from sending data in a truck. There is always the risk of the truck being hijacked.
Data management is a concern as well. What happens if you said you shipped 10 drives and the other side says that you only shipped nine drives? Maybe you made a mistake. Either way, you have to rebuild what you think may be missing. That is why U.S. laws require that you report any consumer data that may be lost or breached. So you are in for a long trek if something like that happens to your data.
LI: Do you see the cloud storage industry minimizing these concerns or will data loss always be a major concern?
Villanustre: I think eventually these problems will be solved. Cloud technology has only been around for 10 years or so. The reality is, it still has a long way to go. Until then, there are some immature elements of cloud computing that will keep data more at risk.