summarytools
A quick overview of your data
R
packages
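## Installing and loading
The package only needs to be installed once, but summarytools must be loaded again each time R is restarted before it can be used.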
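The install-once, load-every-session pattern described above might look like the following minimal sketch. It assumes the package is installed from CRAN; the `requireNamespace()` guard is an addition here so that the install step is skipped when the package is already present.

```r
# Install once (skipped when summarytools is already installed)
if (!requireNamespace("summarytools", quietly = TRUE)) {
  install.packages("summarytools")
}

# Load the package; this line is needed in every new R session
library(summarytools)
```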
1 Usage examples
1.1 stview
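As an illustration, `stview()` (the summarytools alias for `view()`) can be used to display a summary object as HTML. The sketch below assumes the `mpg` data set from ggplot2, which is the data summarized in the next section, and assumes an interactive session such as RStudio where a viewer pane or browser is available.

```r
library(summarytools)

# mpg ships with ggplot2; load only the data set (assumes ggplot2 is installed)
data(mpg, package = "ggplot2")

# Render the data frame summary as HTML in the RStudio Viewer pane,
# or in the default browser when running outside RStudio
stview(dfSummary(mpg))
```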
1.2 dfSummary
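A call along the following lines should produce a summary like the one shown in this section. This is a sketch, assuming the `mpg` data set comes from ggplot2; the object name `mpg_summary` is only illustrative.

```r
library(summarytools)

# mpg ships with ggplot2; load only the data set (assumes ggplot2 is installed)
data(mpg, package = "ggplot2")

# One row per variable: type, statistics or values, frequencies,
# a small text graph, and valid / missing counts
mpg_summary <- dfSummary(mpg)
print(mpg_summary)
```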
Data Frame Summary: mpg
Dimensions: 234 x 11
Duplicates: 9

| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Valid | Missing |
|----|----------|----------------|--------------------|-------|-------|---------|
| 1 | manufacturer | 1. dodge | 37 (15.8%) | III | 234 | 0 |
| | [character] | 2. toyota | 34 (14.5%) | II | (100.0%) | (0.0%) |
| | | 3. volkswagen | 27 (11.5%) | II | | |
| | | 4. ford | 25 (10.7%) | II | | |
| | | 5. chevrolet | 19 (8.1%) | I | | |
| | | 6. audi | 18 (7.7%) | I | | |
| | | 7. hyundai | 14 (6.0%) | I | | |
| | | 8. subaru | 14 (6.0%) | I | | |
| | | 9. nissan | 13 (5.6%) | I | | |
| | | 10. honda | 9 (3.8%) | | | |
| | | [ 5 others ] | 24 (10.3%) | II | | |
| 2 | model | 1. caravan 2wd | 11 (4.7%) | | 234 | 0 |
| | [character] | 2. ram 1500 pickup 4wd | 10 (4.3%) | | (100.0%) | (0.0%) |
| | | 3. civic | 9 (3.8%) | | | |
| | | 4. dakota pickup 4wd | 9 (3.8%) | | | |
| | | 5. jetta | 9 (3.8%) | | | |
| | | 6. mustang | 9 (3.8%) | | | |
| | | 7. a4 quattro | 8 (3.4%) | | | |
| | | 8. grand cherokee 4wd | 8 (3.4%) | | | |
| | | 9. impreza awd | 8 (3.4%) | | | |
| | | 10. a4 | 7 (3.0%) | | | |
| | | [ 28 others ] | 146 (62.4%) | IIIIIIIIIIII | | |
| 3 | displ | Mean (sd) : 3.5 (1.3) | 35 distinct values | (histogram) | 234 | 0 |
| | [numeric] | min < med < max: | | | (100.0%) | (0.0%) |
| | | 1.6 < 3.3 < 7 | | | | |
| | | IQR (CV) : 2.2 (0.4) | | | | |
| 4 | year | Min : 1999 | 1999 : 117 (50.0%) | IIIIIIIIII | 234 | 0 |
| | [integer] | Mean : 2003.5 | 2008 : 117 (50.0%) | IIIIIIIIII | (100.0%) | (0.0%) |
| | | Max : 2008 | | | | |
| 5 | cyl | Mean (sd) : 5.9 (1.6) | 4 : 81 (34.6%) | IIIIII | 234 | 0 |
| | [integer] | min < med < max: | 5 : 4 (1.7%) | | (100.0%) | (0.0%) |
| | | 4 < 6 < 8 | 6 : 79 (33.8%) | IIIIII | | |
| | | IQR (CV) : 4 (0.3) | 8 : 70 (29.9%) | IIIII | | |
| 6 | trans | 1. auto(av) | 5 (2.1%) | | 234 | 0 |
| | [character] | 2. auto(l3) | 2 (0.9%) | | (100.0%) | (0.0%) |
| | | 3. auto(l4) | 83 (35.5%) | IIIIIII | | |
| | | 4. auto(l5) | 39 (16.7%) | III | | |
| | | 5. auto(l6) | 6 (2.6%) | | | |
| | | 6. auto(s4) | 3 (1.3%) | | | |
| | | 7. auto(s5) | 3 (1.3%) | | | |
| | | 8. auto(s6) | 16 (6.8%) | I | | |
| | | 9. manual(m5) | 58 (24.8%) | IIII | | |
| | | 10. manual(m6) | 19 (8.1%) | I | | |
| 7 | drv | 1. 4 | 103 (44.0%) | IIIIIIII | 234 | 0 |
| | [character] | 2. f | 106 (45.3%) | IIIIIIIII | (100.0%) | (0.0%) |
| | | 3. r | 25 (10.7%) | II | | |
| 8 | cty | Mean (sd) : 16.9 (4.3) | 21 distinct values | (histogram) | 234 | 0 |
| | [integer] | min < med < max: | | | (100.0%) | (0.0%) |
| | | 9 < 17 < 35 | | | | |
| | | IQR (CV) : 5 (0.3) | | | | |
| 9 | hwy | Mean (sd) : 23.4 (6) | 27 distinct values | (histogram) | 234 | 0 |
| | [integer] | min < med < max: | | | (100.0%) | (0.0%) |
| | | 12 < 24 < 44 | | | | |
| | | IQR (CV) : 9 (0.3) | | | | |
| 10 | fl | 1. c | 1 (0.4%) | | 234 | 0 |
| | [character] | 2. d | 5 (2.1%) | | (100.0%) | (0.0%) |
| | | 3. e | 8 (3.4%) | | | |
| | | 4. p | 52 (22.2%) | IIII | | |
| | | 5. r | 168 (71.8%) | IIIIIIIIIIIIII | | |
| 11 | class | 1. 2seater | 5 (2.1%) | | 234 | 0 |
| | [character] | 2. compact | 47 (20.1%) | IIII | (100.0%) | (0.0%) |
| | | 3. midsize | 41 (17.5%) | III | | |
| | | 4. minivan | 11 (4.7%) | | | |
| | | 5. pickup | 33 (14.1%) | II | | |
| | | 6. subcompact | 35 (15.0%) | II | | |
| | | 7. suv | 62 (26.5%) | IIIII | | |
1.3 descr: Descriptive Statistics
This function is well suited to getting to know numeric data.
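A minimal sketch of such a call, assuming the same `mpg` data set from ggplot2 used earlier. To my understanding, `descr()` skips the non-numeric columns of a data frame and reports which ones it ignored. The name `mpg_descr` is only illustrative.

```r
library(summarytools)

# mpg ships with ggplot2; load only the data set (assumes ggplot2 is installed)
data(mpg, package = "ggplot2")

# Descriptive statistics for the numeric columns
# (displ, year, cyl, cty, hwy); character columns are left out
mpg_descr <- descr(mpg)
print(mpg_descr)
```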
We can also restrict the output to a subset of the statistics.
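One way to do this is through the `stats` argument of `descr()`. The sketch below assumes the `mpg` data set from ggplot2; the four statistics chosen and the name `mpg_basic` are only illustrative.

```r
library(summarytools)

# mpg ships with ggplot2; load only the data set (assumes ggplot2 is installed)
data(mpg, package = "ggplot2")

# Keep only the mean, standard deviation, minimum and maximum
mpg_basic <- descr(mpg, stats = c("mean", "sd", "min", "max"))
print(mpg_basic)
```

The `stats` argument also accepts shortcuts such as `"common"` and `"fivenum"` in place of an explicit vector.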