Authors & Year: Peng Shi and Gee Y. Lee (2022)
Journal: Journal of the American Statistical Association
Written by Leonard Bryan L. Kho and Andrew Adrian Pua
Have you ever wondered what it’s like to run an insurance company? What role does statistics play in insurance company operations and how can its use be profitable? In this article, we’re going to explore property insurance and a very recent improvement in statistical modeling in this area.
How does insurance work?
Suppose a guy named Bob owns a property. The chance of significant damage to it is very small, but he still feels uneasy because such damage would be a heavy financial burden.
Now, suppose that for some reason we are willing to pay Bob if damage befalls his property. However, resources are limited, and we are not willing to cover his losses for free. So, we agree with Bob that he’ll pay us a certain amount of money regularly, and in return, we’re going to pay for his loss every time his property experiences damage. In insurance terms, our contract is called a policy. We are the insurer and Bob is the policyholder. The amount of money Bob pays us regularly is called a premium. If we decide to offer Bob a contract, then we have underwritten his policy. When we calculate Bob’s premium, we are ratemaking.
What is a deductible?
We realize that Bob now has less incentive to protect his property, since we are the ones who pay for his losses. As part of his policy, we offer to decrease Bob’s premium, but in return, he should agree that if a loss does not exceed a certain threshold, he pays for the loss himself. If the loss does exceed the threshold, he shoulders the loss up to the threshold, while we cover the excess. He states the threshold (say, 10% of the property’s value), and we recalculate the premium accordingly. This threshold is known as a deductible and is designed to give the policyholder an incentive to keep protecting his property. Raising the deductible leads to a lower premium, and lowering the deductible leads to a higher premium. With the deductible set, our goal is to compute the premium so that we earn a profit or at least avoid losing money. Moreover, the premium has to be reasonable for Bob. Otherwise, he would find a competing insurer who could offer a lower premium.
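The way a deductible splits a single loss between Bob and us can be written down in a few lines. The sketch below is only an illustration of the rule described above; the function name `split_loss` and all amounts are hypothetical.

```python
def split_loss(loss: float, deductible: float) -> tuple[float, float]:
    """Return (paid_by_policyholder, paid_by_insurer) for a single loss."""
    if loss < 0 or deductible < 0:
        raise ValueError("loss and deductible must be non-negative")
    # The policyholder shoulders the loss up to the deductible.
    policyholder_share = min(loss, deductible)
    # The insurer covers only the part of the loss above the deductible.
    insurer_share = max(loss - deductible, 0.0)
    return policyholder_share, insurer_share


# Hypothetical example: a deductible of 20,000 (10% of a property worth 200,000).
deductible = 20_000.0

small_loss = split_loss(5_000.0, deductible)    # loss below the deductible
large_loss = split_loss(50_000.0, deductible)   # loss above the deductible

print("small loss (Bob pays, we pay):", small_loss)
print("large loss (Bob pays, we pay):", large_loss)
```

For the small loss Bob pays everything himself; for the large loss he pays only the deductible and we pay the rest. A higher deductible shrinks our share of every loss, which is why it comes with a lower premium.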
What data do we typically have as an insurer?
We would know how many times Bob files a claim for damage (known as claim frequency), how much the damage costs (known as claim severity), and the deductible he chose. On top of that, we would have data on rating variables. These rating variables include some of Bob’s personal characteristics, such as his income and repayment history, the level of protection he sets for his property, and other specifics of the policy.
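To make these quantities concrete, here is a minimal sketch of how claim frequency and claim severity could be computed from one policy's history. The record layout (`PolicyHistory`) and the numbers are hypothetical, and multiplying frequency by severity to get an average yearly cost is the textbook shortcut, not the model proposed in the paper.

```python
from dataclasses import dataclass


@dataclass
class PolicyHistory:
    """Hypothetical record of what an insurer observes for one policy."""
    claim_payments: list[float]  # amount the insurer paid on each claim
    years_observed: float        # length of the observation window in years
    deductible: float            # deductible chosen by the policyholder


def claim_frequency(history: PolicyHistory) -> float:
    """Average number of claims per year."""
    return len(history.claim_payments) / history.years_observed


def claim_severity(history: PolicyHistory) -> float:
    """Average payment per claim (0.0 if no claim was filed)."""
    if not history.claim_payments:
        return 0.0
    return sum(history.claim_payments) / len(history.claim_payments)


def average_annual_cost(history: PolicyHistory) -> float:
    """Frequency times severity: the insurer's average payout per year."""
    return claim_frequency(history) * claim_severity(history)


# Hypothetical data: two claims over five years.
bob = PolicyHistory(claim_payments=[30_000.0, 10_000.0],
                    years_observed=5.0,
                    deductible=20_000.0)

print("claims per year:", claim_frequency(bob))
print("average payment per claim:", claim_severity(bob))
print("average payout per year:", average_annual_cost(bob))
```

A premium would need to cover at least this average yearly payout, plus expenses and a margin, for the policy to be worth underwriting.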
How can ratemaking be improved?
In reality, insurers have a lot of potential policyholders, and policyholders may want to protect more than just one property. Ratemaking depends on the characteristics of the policyholder and the properties to be insured. Rating variables may be measured with error or cannot be observed without substantial costs and possibly invasions of privacy.
Curiously, ratemaking has traditionally assumed that claim frequency and claim severity are independent of the policyholder’s choice of deductible (that is, the deductible is treated as exogenous). In contrast, the authors of this paper improve current statistical modeling approaches by allowing for interdependence among the policyholder’s deductible choice, claim frequency, and claim severity. The key was to model the joint distribution of these three components pair by pair (an approach known as pair copula constructions), with the pairings chosen to match the insurance industry context. Pair copula constructions allow the deductible to be endogenous: the deductible choice and the claim outcomes are allowed to depend on one another.
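A small simulation can show why the independence assumption matters. The sketch below is not the authors' pair copula construction. It uses a simpler stand-in, a one-factor Gaussian copula, in which a single unobserved risk factor drives deductible choice, claim frequency, and claim severity. The loadings, the Poisson and exponential margins, and every number are assumptions made only for illustration.

```python
import math
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def poisson_quantile(u: float, lam: float) -> int:
    """Smallest k with P(N <= k) >= u for N ~ Poisson(lam)."""
    k, pmf = 0, math.exp(-lam)
    cdf = pmf
    while cdf < u and k < 1000:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def simulate_policy(rng: random.Random, loadings: dict[str, float]):
    """Draw (chose_high_deductible, yearly_loss) for one policyholder."""
    factor = rng.gauss(0.0, 1.0)  # unobserved riskiness shared by all three
    u = {}
    for name, a in loadings.items():
        z = a * factor + math.sqrt(1.0 - a * a) * rng.gauss(0.0, 1.0)
        # Map to a uniform; clamp so the inverse CDFs below stay finite.
        u[name] = min(STD_NORMAL.cdf(z), 1.0 - 1e-12)
    chose_high = u["deductible"] > 0.5
    n_claims = poisson_quantile(u["frequency"], lam=0.5)
    # Exponential severity with mean 10,000; one severity level per
    # policyholder is a simplification.
    severity = -10_000.0 * math.log(1.0 - u["severity"])
    return chose_high, n_claims * severity


def mean_loss_by_deductible(deductible_loading: float,
                            n_policies: int = 20_000,
                            seed: int = 1) -> tuple[float, float]:
    """Return (mean loss of low-deductible, mean loss of high-deductible)."""
    rng = random.Random(seed)
    loadings = {"deductible": deductible_loading,
                "frequency": 0.6, "severity": 0.6}
    low, high = [], []
    for _ in range(n_policies):
        chose_high, loss = simulate_policy(rng, loadings)
        (high if chose_high else low).append(loss)
    return mean(low), mean(high)


# Exogenous deductible: the choice carries no information about risk.
print("independent:", mean_loss_by_deductible(0.0))
# Endogenous deductible: riskier policyholders tend to pick low deductibles.
print("dependent:  ", mean_loss_by_deductible(-0.6))
```

With a loading of zero, the two groups have similar average losses. With a negative loading, policyholders who choose a high deductible also tend to have fewer and smaller claims, so the deductible choice itself becomes a signal of risk.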
What are the implications of this improvement?
Consider a measure of the policy’s risk known as relativity. Higher deductibles are indicative of low relativity. Under the traditional approach, we will tend to underestimate the risk for low-deductible policies and overestimate it for high-deductible policies. We present two relativity-deductible curves, drawn depending on whether the deductible is modeled as exogenous or endogenous.
Figure: Looking for profits while accounting for policyholder and competitor behavior
Source: Image based on Figure 4 of Shi and Lee (2022). Letters (A, B, C, D) were used to mark areas to be discussed.
Focus on the figure on the left side. Suppose we were the only insurer in the industry using the endogenous-deductible model. This means that we are the only insurer aware that the risks of high-deductible policyholders are being overestimated by the traditional approach. Assuming that other insurers face the same relativity-deductible curve, the other insurers expect the area ABCD to be the potential incurred losses of underwriting high-deductible policies. For us, the potential incurred losses will be lower since the green area is excluded.
How are we to take advantage of this? We charge the same premiums as the other insurers do. Then we underwrite more of the policies that are most profitable for us (a practice known as cream skimming). These are the high-deductible policies, whose policyholders are paying premiums that are too high for their risk.
Now, focus on the figure on the right side. Suppose all insurers were aware that the endogenous-deductible model is more appropriate. There will be stronger incentives to underwrite high-deductible policies, which may force insurers to also underwrite low-deductible policies.
The area represented by ABCD below the relativity-deductible curve under the endogenous-deductible model is the premium we are to earn from underwriting low-deductible policies. If we stick to the traditional approach of ratemaking for low-deductible policies, we will set the premiums too low, which leads to an underwriting loss represented by the red region.
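The arithmetic behind both scenarios can be sketched with made-up numbers. In the sketch below, the base premium and all four relativities are hypothetical and are not taken from Figure 4 of Shi and Lee (2022). They are chosen only so that the traditional (exogenous) model overestimates risk for high deductibles and underestimates it for low deductibles, as described above.

```python
BASE_PREMIUM = 1_000.0  # hypothetical premium for a policy with relativity 1.0

# Hypothetical relativities for two deductible levels under each model.
RELATIVITY = {
    "low":  {"exogenous": 1.00, "endogenous": 1.15},
    "high": {"exogenous": 0.80, "endogenous": 0.65},
}


def margin_per_policy(deductible_level: str) -> float:
    """Premium charged under the traditional model minus expected cost.

    The premium is priced with the exogenous relativity, while the
    endogenous relativity is treated as the better estimate of the risk.
    """
    charged = BASE_PREMIUM * RELATIVITY[deductible_level]["exogenous"]
    expected_cost = BASE_PREMIUM * RELATIVITY[deductible_level]["endogenous"]
    return charged - expected_cost


for level in ("high", "low"):
    print(level, "deductible margin per policy:", margin_per_policy(level))
```

The high-deductible margin is positive, which is the room for cream skimming, and the low-deductible margin is negative, which corresponds to the underwriting loss from pricing those policies too low.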
What is the takeaway?
In this post, we have illustrated that the improvement in statistical modeling enables insurers to better gauge the riskiness of policyholders from their proposed deductible while staying competitive against other insurers. Shi and Lee (2022) also show that insurers can make better predictions of claim frequency and severity, and can be better prepared for large payouts when high-risk claims are filed.
Allowing the deductible to be endogenous means re-examining convenient independence assumptions, especially in the insurance industry context. Independence assumptions become untenable because the behavior of potential policyholders tends to affect all three components of interest to insurers (claim frequency, claim severity, and deductible choice) at the same time, and insurers cannot perfectly observe everything about the policyholder. Thus, the statistical approach of using pair copula constructions represents a step towards greater realism and potentially more profits.
Image credit: Peng Shi and Gee Y. Lee