The GiniDecompLY R package provides a set of functions to estimate the effect of each
income source on income inequality, based on the decomposition of
Lerman and Yitzhaki (1985) doi:10.2307/1928447. Results are returned as
tidy tibbles, making them easy to explore and to reuse for further
analysis and visualization.
The functions include gini_corr, gini_decomp_source, and gini_income_elasticity.
A sample dataset of income sources is provided with the package:
dplyr::glimpse(sample_income_data)
#> Rows: 200
#> Columns: 6
#> $ region <chr> "Rural", "Urban", "Urban", "Rural", "Urban", "Urban", "Rural", "Rural", "Urban", "Urban", "Urban", "Urban", "Rural", "Ur…
#> $ sample_wgt <dbl> 991, 1328, 493, 963, 1065, 934, 1043, 259, 400, 649, 514, 758, 715, 1119, 1052, 1927, 418, 538, 987, 147, 1788, 1019, 13…
#> $ wage <dbl> 0, 538, 0, 0, 0, 0, 967, 0, 121, 1985, 0, 1338, 500, 441, 0, 273, 468, 563, 513, 0, 253, 887, 0, 0, 0, 0, 1788, 1941, 0,…
#> $ self_employment_rev <dbl> 0, 0, 0, 0, 2191, 0, 298, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 438, 1332, 701, 0, 0, 419, 1402, 0, 0, 0, 0, 0…
#> $ farming_rev <dbl> 0, 0, 414, 268, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 646, 504, 0, 356, 0, 0, 155, 0, 0, 349, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1…
#> $ other_rev <dbl> 338, 586, 4026, 377, 23, 779, 200, 1554, 450, 0, 2638, 0, 294, 297, 239, 266, 340, 191, 228, 1212, 225, 364, 1800, 5438,…
The sample data contains 200 observations representing households and 6 columns:
- region categorizes households by whether they reside in urban or rural areas.
- sample_wgt contains the sampling weight assigned to each observation.
- wage, self_employment_rev, farming_rev, and other_rev represent the household's different sources of income. Specifically, wage refers to wages and salaries earned, self_employment_rev denotes revenue from independent activities, farming_rev is income from farming, and other_rev covers income from other sources.
Gini correlation was first introduced in Schechtman and Yitzhaki (1987). It is a non-symmetric measure and can take the following two forms, depending on which variable is given in its actual values and which one is ranked:
If $X$ and $Y$ are random variables with continuous distribution functions $F_X$ and $G_Y$ respectively, then
$$ \Gamma(X,Y) = \frac{Cov(X,G_Y(Y))}{Cov(X,F_X(X))} $$
and similarly,
$$ \Gamma(Y,X) = \frac{Cov(Y,F_X(X))}{Cov(Y,G_Y(Y))} $$
The range of the Gini correlation coefficient is [−1, 1].
The gini_corr function takes as its first argument the variable
in its actual values, and as its second argument the variable whose
distribution supplies the rank information. An optional argument for
sampling weights is also available.
For example, to calculate the Gini correlation coefficient of the salary distribution ranked by the total income distribution, pass the wage variable as the first argument and the total income as the second.
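To make the definition concrete, here is a minimal base R sketch of the unweighted Gini correlation. The helper name `gini_corr_sketch` and the toy values are assumptions made for illustration. This is not the package's `gini_corr` implementation and it ignores sampling weights.

```r
# Unweighted Gini correlation Gamma(x, y): x enters with its actual values,
# y only through its ranks. Hypothetical helper, not the package function.
gini_corr_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  # rank() stands in for the empirical CDF; the 1/n scaling cancels in the ratio.
  cov(x, rank(y)) / cov(x, rank(x))
}

# Toy data (made up for illustration): one income source and a total income.
wage  <- c(0, 538, 967, 121, 1985, 1338)
other <- c(338, 586, 200, 450, 0, 2638)
total <- wage + other

gamma_wage_total <- gini_corr_sketch(wage, total)  # wage ranked by total income
gamma_total_wage <- gini_corr_sketch(total, wage)  # total income ranked by wage
```

Swapping the two arguments generally gives a different value, which is what "non-symmetric" means here.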
The Lerman and Yitzhaki decomposition method aims to understand the contribution of different income sources to overall income inequality.
Suppose the total income of households comes from the 4 income sources
described in the sample_income_data dataset. The Lerman and Yitzhaki
decomposition estimates the contribution of each income source to
income inequality, as measured by the Gini coefficient.
Gini(income) = 0.421
The final output of the decomposition consists of the
Absolute Contribution of each income source, which sum to
the overall Gini(income),
and the Relative Contribution of each source, which sum to 1.
The steps of estimation include:
1- Calculation of the Share of each income source in the total income;
2- Calculation of the Gini coefficient for each income source;
3- Calculation of Gini_corr (the Gini correlation coefficient) between the distribution of each income source and the total income.
Absolute Contribution is calculated as follows:
Absolute Contribution = Share * Gini * Gini_corr
and then
Relative Contribution = Absolute Contribution / Gini(income)
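The steps above can be sketched in base R on toy data. All helper names and values are hypothetical. The Gini coefficient uses the covariance form 2 * cov(y, F(y)) / mean(y), with ranks standing in for F, and sampling weights are ignored. The package's own estimators may differ in detail.

```r
# Hypothetical helpers: unweighted, covariance-based estimators.
gini_sketch <- function(y) 2 * cov(y, rank(y) / length(y)) / mean(y)
gini_corr_sketch <- function(x, y) cov(x, rank(y)) / cov(x, rank(x))

# Toy data (made up for illustration): two income sources for six households.
sources <- data.frame(
  wage      = c(0, 538, 967, 121, 1985, 1338),
  other_rev = c(338, 586, 200, 450, 0, 2638)
)
total <- rowSums(sources)

decomp <- data.frame(
  income_source = names(sources),
  Share     = colSums(sources) / sum(total),                  # step 1
  Gini      = sapply(sources, gini_sketch),                   # step 2
  Gini_corr = sapply(sources, gini_corr_sketch, y = total)    # step 3
)
decomp$Absolute_Contribution <- decomp$Share * decomp$Gini * decomp$Gini_corr
decomp$Relative_Contribution <- decomp$Absolute_Contribution / gini_sketch(total)
```

With these estimators the absolute contributions add up to `gini_sketch(total)` and the relative contributions add up to 1, up to floating-point error.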
gini_decomp_source returns a tibble containing all the
components described above. It takes as its first argument the
.data containing the income source variables, followed by the
column names (or positions), separated by commas, that identify the
income sources.
The function provides two optional arguments:
- .wgt for sampling weights;
- .by for disaggregating the results, in which case the output is a grouped tibble.
We can also pass the income source variables by their positions in the data frame:
gini_decomp_source(sample_income_data, 3:6, .by = region, .wgt = sample_wgt)
#> # A tibble: 8 × 7
#> # Groups: region [2]
#>   region income_source       Share   Gini  Gini_corr Absolute_Contribution Relative_Contribution
#>   <chr>  <chr>               <dbl>   <dbl> <dbl>     <dbl>                 <dbl>
#> 1 Rural  farming_rev         0.412   0.815 0.854     0.287                 0.654
#> 2 Rural  other_rev           0.301   0.402 0.395     0.0478                0.109
#> 3 Rural  self_employment_rev 0.0674  0.907 0.471     0.0288                0.0655
#> 4 Rural  wage                0.219   0.706 0.488     0.0755                0.172
#> 5 Urban  farming_rev         0.00189 0.989 0.737     0.00138               0.00354
#> 6 Urban  other_rev           0.373   0.618 0.634     0.146                 0.374
#> 7 Urban  self_employment_rev 0.165   0.827 0.327     0.0446                0.114
#> 8 Urban  wage                0.460   0.648 0.665     0.198                 0.508
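As a quick consistency check, the Rural rows of the output above can be recomputed by hand. The sketch below copies the printed values, which are rounded to three significant digits, so the results should agree with the table only approximately. The object names are made up for illustration.

```r
# Rural rows copied from the printed output (rounded values).
rural <- data.frame(
  income_source = c("farming_rev", "other_rev", "self_employment_rev", "wage"),
  Share     = c(0.412, 0.301, 0.0674, 0.219),
  Gini      = c(0.815, 0.402, 0.907, 0.706),
  Gini_corr = c(0.854, 0.395, 0.471, 0.488)
)

# Absolute Contribution = Share * Gini * Gini_corr
rural$Absolute_Contribution <- rural$Share * rural$Gini * rural$Gini_corr

# The absolute contributions sum to the group's overall Gini (roughly 0.439 here).
gini_rural <- sum(rural$Absolute_Contribution)

# Relative Contribution = Absolute Contribution / Gini(income)
rural$Relative_Contribution <- rural$Absolute_Contribution / gini_rural
```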
The decomposition method described above can be represented by the following equation:
$$ Gini(Y) = \sum_k S_k \Gamma(Y_k,Y) Gini(Y_k) $$
where $Y_k$ and $S_k$ are the distribution and the Share of income source $k$, respectively, and $Y$ denotes the distribution of the total income.
Thus, the Absolute Contribution of the $k$th income source is the product $S_k \Gamma(Y_k,Y) Gini(Y_k)$.
Stark et al. (1986) showed that this decomposition equation offers a simple way to assess the marginal effect on total income inequality of a marginal percentage change in income from a given source, equal for all households. The impact of an increase in income from source $k$ for all households, such that $Y_k$ is multiplied by $(1 + e_k)$ with $e_k$ sufficiently small, can be written as follows:
$$ \frac{\partial Gini(Y)}{\partial e_k} = S_k (\Gamma(Y_k,Y) Gini(Y_k)- Gini(Y)) $$
This expression measures the marginal contribution of source $k$ to overall inequality. Dividing it by $Gini(Y)$ gives the relative marginal effect on inequality of a marginal variation in the income from source $k$:
$$ \frac{\partial Gini(Y) / \partial e_k}{Gini(Y)} = S_k \left( \frac{\Gamma(Y_k,Y) Gini(Y_k)}{Gini(Y)} - 1 \right) = S_k(\eta_k-1) $$
where
$$ \eta_k = \frac{\Gamma(Y_k,Y) Gini(Y_k)}{Gini(Y)} $$
denotes the elasticity of the Gini index with respect to a percentage change in the mean income of source $k$.
This definition shows that the marginal impact of an income source depends on the elasticity of the Gini index with respect to that source. A percentage increase in income from a source with $\eta_k$ lower (higher) than 1 will decrease (increase) overall income inequality. When $\eta_k$ is close to 1, a variation in this source has almost no effect on overall inequality.
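One consequence of these definitions is worth spelling out, assuming the shares $S_k$ sum to 1 as they do when the sources add up to total income. Dividing the decomposition equation by $Gini(Y)$ gives $\sum_k S_k \eta_k = 1$, so the relative marginal impacts cancel out across sources:

$$ \sum_k S_k(\eta_k - 1) = \sum_k S_k \eta_k - \sum_k S_k = 1 - 1 = 0 $$

In other words, raising every source by the same small percentage leaves the Gini index unchanged. The Marginal_Impact column should therefore sum to roughly zero within each group, up to rounding.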
The gini_income_elasticity function calculates the
Elasticity along with the Marginal_Impact of a
change in the mean of each income source on the overall Gini index.
The function takes the same set of arguments as gini_decomp_source:
sample_income_data |>
  gini_income_elasticity(wage, self_employment_rev, farming_rev, other_rev,
                         .by = region,
                         .wgt = sample_wgt)
#> # A tibble: 8 × 7
#> # Groups: region [2]
#>   region income_source       Share   Gini  Gini_corr Elasticity Marginal_Impact
#>   <chr>  <chr>               <dbl>   <dbl> <dbl>     <dbl>      <dbl>
#> 1 Rural  farming_rev         0.412   0.815 0.854     1.59        0.241
#> 2 Rural  other_rev           0.301   0.402 0.395     0.362      -0.192
#> 3 Rural  self_employment_rev 0.0674  0.907 0.471     0.973      -0.00182
#> 4 Rural  wage                0.219   0.706 0.488     0.784      -0.0474
#> 5 Urban  farming_rev         0.00189 0.989 0.737     1.87        0.00164
#> 6 Urban  other_rev           0.373   0.618 0.634     1.00        0.000976
#> 7 Urban  self_employment_rev 0.165   0.827 0.327     0.693      -0.0506
#> 8 Urban  wage                0.460   0.648 0.665     1.10        0.0480
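The Elasticity and Marginal_Impact columns can be approximately reproduced from the other columns using the formulas for $\eta_k$ and $S_k(\eta_k - 1)$. The sketch below does this for the Rural rows, using the rounded values printed above and taking the group's Gini as the sum of Share * Gini * Gini_corr. The object names are made up for illustration, and small discrepancies from rounding are expected.

```r
# Rural rows copied from the printed output (rounded values).
rural <- data.frame(
  income_source = c("farming_rev", "other_rev", "self_employment_rev", "wage"),
  Share     = c(0.412, 0.301, 0.0674, 0.219),
  Gini      = c(0.815, 0.402, 0.907, 0.706),
  Gini_corr = c(0.854, 0.395, 0.471, 0.488)
)

# Overall Gini of the group, recovered from the decomposition identity.
gini_rural <- sum(rural$Share * rural$Gini * rural$Gini_corr)

# eta_k = Gini_corr * Gini / Gini(Y)
rural$Elasticity <- rural$Gini_corr * rural$Gini / gini_rural

# Relative marginal impact = S_k * (eta_k - 1)
rural$Marginal_Impact <- rural$Share * (rural$Elasticity - 1)
```

Sources with an elasticity above 1 (farming_rev here) get a positive marginal impact, and the impacts nearly cancel when summed over the group.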
Essama-Nssah, B. (2000). Inégalité, pauvreté et bien-être social: fondements analytiques et normatifs. De Boeck Université, Bruxelles.
Handcock, M. (2016). Relative Distribution Methods in the Social Sciences. Springer-Verlag, Inc., New York, 1999. ISBN 0387987789.
Lerman, R. I., and Yitzhaki, S. (1985). Income Inequality Effects by Income Source: A New Approach and Applications to the United States. The Review of Economics and Statistics, 67(1), 151–156. https://doi.org/10.2307/1928447
Rawls, J. (1971). A Theory of Justice. The Belknap Press of Harvard Univers
Schechtman, E., and Yitzhaki, S. (1987). A Measure of Association Based on Gini's Mean Difference. Communications in Statistics: Theory and Methods, A16, 207–231.
Schechtman, E., and Yitzhaki, S. (1999). On the proper bounds of the Gini correlation. Economics Letters, 63(2), 133–138. ISSN 0165-1765.
Sen, A. (1970). Collective Choice and Social Welfare. Holden-Day. ISBN 978-0-444-85127-7.
Soudi K. (20)
Stark, O., Taylor, J., and Yitzhaki, S. (1986). Remittances and Inequality. Economic Journal, 96(383), 722–740.
Wodon, Q., and Yitzhaki, S. (2002). Inequality and Social Welfare. In: A Sourcebook for Poverty Reduction Strategies, Vol. 1 (April 2002), pp. 75–104.