09: Data processing and analysis
This notebook is teaching material for the course BI-JUL.21, taught in the winter semester of the academic year 2023/2024 by Tomáš Kalvoda. The creation of these materials was supported by NVS FIT.
The main page of the course, where further notebooks and other useful information can be found, is its Course Pages site.
Julia Version 1.12.0
Commit b907bd0600f (2025-10-07 15:42 UTC)
Build Info:
Official https://julialang.org release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 8 × Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
WORD_SIZE: 64
LLVM: libLLVM-18.1.7 (ORCJIT, skylake)
GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 8 virtual cores)
For working with data you surely know the Python tool pandas.
DataFrames.jl is essentially the Julia analogue of that tool.
If you are used to pandas, the comparison of pandas with DataFrames.jl may be useful to you.
1.1 How to create a DataFrame?
A DataFrame can be created in many ways.
We can start with an empty table and fill it with data step by step, use an existing matrix, or load the data from an external file.
Creating an empty DataFrame is very easy:
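The original code cell is not preserved in this export. A minimal sketch of what such a cell could look like, assuming the DataFrames package is installed; the column names `course` and `semester` are chosen only for illustration:

```julia
using DataFrames

df = DataFrame()                      # a table with no rows and no columns
@show size(df)                        # (0, 0)

# Columns can be attached afterwards; the first one fixes the number of rows.
df.course = ["BI-LA1", "BI-DML"]
df.semester = [1, 1]
```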
Keyword arguments and NamedTuple
Data in named columns can be passed as keyword arguments (the keyword is the column name, the value is the data):
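A sketch of a call of this shape, assuming the DataFrames package; the values mirror the table printed in this section, but the exact original cell is not preserved, so treat this as a reconstruction:

```julia
using DataFrames

# Each keyword becomes a column name, each value the column data.
df = DataFrame(
    course = ["BI-LA1", "BI-DML", "BI-MA1", "BI-MA2"],
    semester = [1, 1, 2, 2],
    department = fill(18105, 4),
)

# Three equivalent ways to read one column.
df.course
df[!, :course]
df[!, "course"]
```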
| Row | course | semester | department |
|---|
| String | Int64 | Int64 |
|---|
| 1 | BI-LA1 | 1 | 18105 |
| 2 | BI-DML | 1 | 18105 |
| 3 | BI-MA1 | 2 | 18105 |
| 4 | BI-MA2 | 2 | 18105 |
4-element Vector{String}:
"BI-LA1"
"BI-DML"
"BI-MA1"
"BI-MA2"
4-element Vector{String}:
"BI-LA1"
"BI-DML"
"BI-MA1"
"BI-MA2"
4-element Vector{String}:
"BI-LA1"
"BI-DML"
"BI-MA1"
"BI-MA2"
This approach, however, makes it difficult to supply columns whose names contain special characters such as spaces.
For that we can use a dictionary or pairs (the keys may be strings or symbols; symbols are recommended, and underscores are preferable to spaces, as in :slovo_slovo):
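A minimal sketch of the pair-based constructor, assuming the DataFrames package. The column names and values are taken from the table printed in this section; the original cell itself is not preserved:

```julia
using DataFrames

# String keys allow spaces and other special characters in column names.
df = DataFrame(
    "fiktivní postava" => ["Gandalf", "Harry Potter"],
    "kniha" => ["Pán prstenů", "Harry Potter a kámen mudrců"],
)

# Such columns are reached with a quoted name.
df."fiktivní postava"
df[!, "kniha"]
```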
| Row | fiktivní postava | kniha |
|---|
| String | String |
|---|
| 1 | Gandalf | Pán prstenů |
| 2 | Harry Potter | Harry Potter a kámen mudrců |
2-element Vector{String}:
"Pán prstenů"
"Harry Potter a kámen mudrců"
2-element Vector{String}:
"Gandalf"
"Harry Potter"
2-element Vector{String}:
"Gandalf"
"Harry Potter"
2-element Vector{String}:
"Gandalf"
"Harry Potter"
Another option is to use a NamedTuple:
By columns (note the subtle difference from the approach above):
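The distinction between the two NamedTuple forms can be sketched as follows, assuming the DataFrames package. The data are illustrative and simpler than in the original cells, which are not preserved here:

```julia
using DataFrames

# By columns: one NamedTuple whose fields are whole column vectors.
by_columns = DataFrame((a = [1, 2, 3], b = [4, 5, 6]))

# By rows: a vector of NamedTuples, one per row.
by_rows = DataFrame([(a = 1, b = 4), (a = 2, b = 5), (a = 3, b = 6)])
```

Both calls build the same table; only the orientation of the input differs.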
| Row | a | b | c |
|---|
| Int64 | Union… | Union… |
|---|
| 1 | 1 | 4 | |
| 2 | 2 | 5 | |
| 3 | 3 | | 6 |
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|
| Symbol | Union… | Union… | Union… | Union… | Int64 | Type |
|---|
| 1 | a | 2.0 | 1 | 2.0 | 3 | 0 | Int64 |
| 2 | b | | | | | 0 | Union{Nothing, Int64} |
| 3 | c | | | | | 0 | Union{Nothing, Int64} |
Matrices
A matrix can also be used to create a DataFrame; we only have to decide how the columns are named.
Automatically (this requires passing :auto as the second argument) they are labelled x1, x2, etc.:
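A small sketch of the matrix constructor, assuming the DataFrames package. It shows the automatic naming and, for comparison, explicitly supplied names (the names `col1` to `col3` are illustrative):

```julia
using DataFrames

M = rand(2, 3)

auto_named = DataFrame(M, :auto)                      # columns x1, x2, x3
custom_named = DataFrame(M, [:col1, :col2, :col3])    # our own column names
```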
| Row | x1 | x2 | x3 |
|---|
| Float64 | Float64 | Float64 |
|---|
| 1 | 0.65177 | 0.801115 | 0.532499 |
| 2 | 0.967483 | 0.328116 | 0.519358 |
| Row | x1 | x2 | x3 |
|---|
| Int64 | Int64 | Int64 |
|---|
| 1 | -7919055112889388537 | 2828097916093900641 | 3985183950275755778 |
| 2 | -2546959018841836274 | 4269260951158923426 | -544554040784498185 |
Alternatively, the second argument can hold our own column names:
| Row | col1 | col2 | col3 |
|---|
| Float64 | Float64 | Float64 |
|---|
| 1 | 0.196401 | 0.36887 | 0.919708 |
| 2 | 0.010729 | 0.00947155 | 0.520836 |
In some examples in this notebook we will use an anonymized CSV export from Grades for the course BI-MA1 in semester B212 (2021/2022).
Importing data from CSV is easy with the CSV.jl package, which we add in the usual way with ] add CSV and then import:
Now it suffices to call the read function from the CSV module; the first argument is the path to the file and the second is the desired output "format", in our case DataFrame:
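A self-contained sketch of this import, assuming the CSV and DataFrames packages are installed. The real export is not available here, so a hypothetical three-column miniature is written to a temporary file first:

```julia
using CSV, DataFrames

# Hypothetical stand-in for the real export; the empty field is read as `missing`.
path = tempname() * ".csv"
write(path, "username,test1,mark\nstudent001,3.0,D\nstudent002,,C\n")

# First argument: path to the file; second argument: the sink type.
df = CSV.read(path, DataFrame)
```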
659×17 DataFrame
634 rows omitted
| Row | username | test1 | test2 | test3 | second_chance | activity | tests_total | gitlab | assessment | exam_test | date | oral_exam | veto | points_total | mark | tutor | percentil |
|---|
| String15 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Bool | Int64? | String7? | Float64? | Bool? | Float64? | String1? | String7 | Int64 |
|---|
| 1 | student001 | 3.0 | 11.5 | 13.5 | missing | 5.0 | 28.0 | missing | true | 17 | 13:15 | 34.0 | missing | 67.0 | D | Irena | 58 |
| 2 | student002 | 2.0 | 3.0 | 18.5 | 19.5 | 3.0 | 25.0 | missing | true | 18 | 12:45 | 48.0 | missing | 76.0 | C | Irena | 43 |
| 3 | student003 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Ondra | 6 |
| 4 | student004 | 3.0 | missing | missing | missing | 4.0 | 3.0 | missing | false | missing | missing | missing | missing | missing | missing | Jan V | 13 |
| 5 | student005 | 3.0 | 18.0 | 17.0 | missing | 4.0 | 38.0 | missing | true | 19 | 17:15 | 56.0 | missing | 98.0 | A | Jarda | 93 |
| 6 | student006 | 3.0 | 10.5 | 15.0 | missing | missing | 28.5 | missing | true | 15 | 16:30 | 50.0 | missing | 78.5 | C | Jan S | 59 |
| 7 | student007 | 2.0 | 19.0 | 16.0 | missing | 2.0 | 37.0 | missing | true | 16 | 11:30 | 55.0 | missing | 94.0 | A | Jan V | 91 |
| 8 | student008 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 6 |
| 9 | student009 | 3.0 | 8.0 | 14.0 | missing | 3.0 | 25.0 | missing | true | 18 | 13:15 | 42.0 | missing | 70.0 | C | Irena | 46 |
| 10 | student010 | 1.0 | 6.0 | 7.0 | missing | missing | 14.0 | missing | false | missing | missing | missing | missing | missing | missing | Jan S | 27 |
| 11 | student011 | 3.0 | 14.5 | 19.0 | missing | 4.0 | 36.5 | missing | true | 17 | 15:00 | 57.0 | missing | 97.5 | A | Jakub | 89 |
| 12 | student012 | 3.0 | 18.0 | 15.0 | missing | missing | 36.0 | missing | true | 17 | 11:30 | 54.0 | missing | 90.0 | A | Jan V | 88 |
| 13 | student013 | 2.0 | missing | missing | missing | missing | 2.0 | missing | false | missing | missing | missing | missing | missing | missing | Irena | 11 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 648 | student648 | 1.0 | 13.5 | 14.5 | missing | missing | 29.0 | missing | true | 15 | 11:00 | 45.0 | missing | 74.0 | C | Jan V | 62 |
| 649 | student649 | 1.0 | missing | missing | missing | missing | 1.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 9 |
| 650 | student650 | 2.5 | 1.0 | 1.0 | missing | missing | 4.5 | missing | false | missing | missing | missing | missing | missing | missing | Ondra | 17 |
| 651 | student651 | 0.0 | 8.0 | 8.0 | missing | 1.0 | 16.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 30 |
| 652 | student652 | 2.0 | 5.5 | 12.5 | 12.5 | 1.0 | 20.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 35 |
| 653 | student653 | 3.0 | 8.5 | 17.5 | missing | missing | 29.0 | missing | true | 16 | 14:45 | 55.0 | missing | 84.0 | B | Jitka | 62 |
| 654 | student654 | 1.0 | 14.0 | 15.0 | missing | missing | 30.0 | missing | true | 16 | 14:30 | 50.0 | missing | 80.0 | B | Jan S | 65 |
| 655 | student655 | 1.5 | 5.0 | missing | missing | missing | 6.5 | missing | false | missing | missing | missing | missing | missing | missing | Irena | 19 |
| 656 | student656 | 2.5 | 5.5 | 17.5 | missing | missing | 25.5 | missing | true | 16 | 10:45 | 54.5 | missing | 80.0 | B | Ondra | 49 |
| 657 | student657 | 1.0 | 15.0 | 6.5 | 16.0 | 1.0 | 25.0 | missing | true | 15 | 14:15 | 42.0 | missing | 68.0 | D | Irena | 41 |
| 658 | student658 | 1.0 | 15.0 | 17.5 | missing | 0.5 | 33.5 | missing | true | 17 | 13:45 | 56.0 | missing | 90.0 | A | Jarda | 79 |
| 659 | student659 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Jakub | 6 |
We will return to these data in more detail later. The table itself will occasionally be used in various examples.
1.2 Working with a DataFrame
The partial listing above immediately raises several questions:
- Which other columns are available?
- Why is the type under some columns followed by a question mark?
- What does missing mean?
Let us explore.
The first thing we can try is to look at the properties of our instance (interactively via TAB completion):
659-element Vector{String15}:
"student001"
"student002"
"student003"
"student004"
"student005"
"student006"
"student007"
"student008"
"student009"
"student010"
"student011"
"student012"
"student013"
⋮
"student648"
"student649"
"student650"
"student651"
"student652"
"student653"
"student654"
"student655"
"student656"
"student657"
"student658"
"student659"
659-element Vector{Bool}:
1
1
0
0
1
1
1
0
1
0
1
1
0
⋮
1
0
0
0
0
1
1
0
1
1
1
0
Alternatively, we can use the names function from the DataFrames module:
17-element Vector{String}:
"username"
"test1"
"test2"
"test3"
"second_chance"
"activity"
"tests_total"
"gitlab"
"assessment"
"exam_test"
"date"
"oral_exam"
"veto"
"points_total"
"mark"
"tutor"
"percentil"
The eachcol function returns an iterator over the columns. With it we can find out the element types of the columns:
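Both inspections can be sketched on a small stand-in table, assuming the DataFrames package. The three columns are illustrative and only borrow their names from the real data:

```julia
using DataFrames

df = DataFrame(
    username = ["student001", "student002"],
    test1 = [3.0, missing],
    assessment = [true, false],
)

column_names = names(df)              # column names as strings
column_symbols = propertynames(df)    # the same names as symbols
column_types = eltype.(eachcol(df))   # element type of every column
```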
17-element Vector{Type}:
String15
Union{Missing, Float64}
Union{Missing, Float64}
Union{Missing, Float64}
Union{Missing, Float64}
Union{Missing, Float64}
Float64
Union{Missing, Float64}
Bool
Union{Missing, Int64}
Union{Missing, String7}
Union{Missing, Float64}
Union{Missing, Bool}
Union{Missing, Float64}
Union{Missing, String1}
String7
Int64
From this we can see what the question marks next to the column element types mean.
The elements of such a column have the union type Union{Missing, T}: values may be absent in some columns, which then contain the value missing.
Brief summary information can be obtained with the describe function.
Of course, some of the reported statistics do not always make much sense.
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|
| Symbol | Union… | Any | Union… | Any | Int64 | Type |
|---|
| 1 | username | | student001 | | student659 | 0 | String15 |
| 2 | test1 | 2.02801 | 0.0 | 2.0 | 3.0 | 70 | Union{Missing, Float64} |
| 3 | test2 | 11.5804 | 0.0 | 12.0 | 20.0 | 68 | Union{Missing, Float64} |
| 4 | test3 | 13.5825 | 0.0 | 14.5 | 20.0 | 150 | Union{Missing, Float64} |
| 5 | second_chance | 12.6964 | 2.5 | 14.0 | 22.5 | 589 | Union{Missing, Float64} |
| 6 | activity | 2.38564 | 0.0 | 2.0 | 5.0 | 334 | Union{Missing, Float64} |
| 7 | tests_total | 22.8437 | 0.0 | 26.5 | 43.0 | 0 | Float64 |
| 8 | gitlab | 0.814815 | 0.5 | 0.5 | 4.0 | 632 | Union{Missing, Float64} |
| 9 | assessment | 0.616085 | false | 1.0 | true | 0 | Bool |
| 10 | exam_test | 16.9113 | -1 | 17.0 | 20 | 253 | Union{Missing, Int64} |
| 11 | date | | --- | | 17:30 | 254 | Union{Missing, String7} |
| 12 | oral_exam | 47.7257 | 0.0 | 49.0 | 57.0 | 278 | Union{Missing, Float64} |
| 13 | veto | 1.0 | true | 1.0 | true | 657 | Union{Missing, Bool} |
| 14 | points_total | 81.5019 | 55.0 | 81.5 | 105.0 | 279 | Union{Missing, Float64} |
| 15 | mark | | A | | F | 253 | Union{Missing, String1} |
| 16 | tutor | | Irena | | Ondra | 0 | String7 |
| 17 | percentil | 50.3612 | 0 | 50.0 | 100 | 0 | Int64 |
We can restrict the output by specifying a particular range of columns.
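A sketch of such a restricted summary, assuming the DataFrames package and its `cols` keyword of `describe`. The table is a hypothetical miniature; the original cell may have used a different column selector:

```julia
using DataFrames

df = DataFrame(
    username = ["s1", "s2", "s3"],
    test1 = [3.0, missing, 1.0],
    test2 = [11.5, 3.0, 8.0],
)

# Summarize only the chosen columns; missing values are skipped in the statistics.
stats = describe(df, cols = [:test1, :test2])
```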
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|
| Symbol | Float64 | Float64 | Float64 | Float64 | Int64 | Union |
|---|
| 1 | test1 | 2.02801 | 0.0 | 2.0 | 3.0 | 70 | Union{Missing, Float64} |
| 2 | test2 | 11.5804 | 0.0 | 12.0 | 20.0 | 68 | Union{Missing, Float64} |
| 3 | test3 | 13.5825 | 0.0 | 14.5 | 20.0 | 150 | Union{Missing, Float64} |
A DataFrame is a table, so functions similar to those for matrices, which can also be viewed as "tables", are available for it.
A DataFrame can be copied with copy, or emptied with empty or empty!.
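A brief sketch of these operations, assuming a recent DataFrames release in which `empty!` removes the rows but keeps the columns:

```julia
using DataFrames

df = DataFrame(rand(3, 2), :auto)

dims = size(df)            # (rows, columns), as for a matrix
duplicate = copy(df)       # independent copy
blank = empty(df)          # new table with the same columns and no rows
empty!(duplicate)          # removes all rows of `duplicate` in place
```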
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.340103 | 0.561576 |
| 2 | 0.348426 | 0.452242 |
| 3 | 0.772511 | 0.292019 |
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.340103 | 0.561576 |
| 2 | 0.348426 | 0.452242 |
| 3 | 0.772511 | 0.292019 |
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.340103 | 0.561576 |
| 2 | 0.348426 | 0.452242 |
| 3 | 0.772511 | 0.292019 |
That is all on the basic properties for now.
Let us now modify the tables.
Indexing and column access
Indexing is based on matrix notation and has similar properties.
Columns can be accessed in several ways:
- by name with a dot:
df.mark, df."mark"
- by indexing (does not create a copy):
df[!, :mark], df[!, "mark"]
- by indexing (creates a copy):
df[:, :mark], df[:, "mark"]
These forms can be used both to read the data and to modify it.
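The difference between the copying and non-copying forms can be checked directly. A minimal sketch, assuming the DataFrames package and a hypothetical two-row table:

```julia
using DataFrames

df = DataFrame(username = ["student001", "student002"], mark = ["D", "C"])

same = df[!, :mark]      # the stored column itself, no copy
copied = df[:, :mark]    # a fresh copy of the column

copied[1] = "A"          # does not touch the table
same[1] = "B"            # changes the table
```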
659-element PooledArrays.PooledVector{Union{Missing, String1}, UInt32, Vector{UInt32}}:
"D"
"C"
missing
missing
"A"
"C"
"A"
missing
"C"
missing
"A"
"A"
missing
⋮
"C"
missing
missing
missing
missing
"B"
"B"
missing
"B"
"D"
"A"
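missing
Besides, we do not have to use the column name; the column position works as well.
For example, the first ten rows of the second through fourth columns:
| Row | test1 | test2 | test3 |
|---|
| Float64? | Float64? | Float64? |
|---|
| 1 | 3.0 | 11.5 | 13.5 |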
| 2 | 2.0 | 3.0 | 18.5 |
| 3 | missing | missing | missing |
| 4 | 3.0 | missing | missing |
| 5 | 3.0 | 18.0 | 17.0 |
| 6 | 3.0 | 10.5 | 15.0 |
| 7 | 2.0 | 19.0 | 16.0 |
| 8 | missing | missing | missing |
| 9 | 3.0 | 8.0 | 14.0 |
| 10 | 1.0 | 6.0 | 7.0 |
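Positional selection of this kind can be sketched as follows, assuming the DataFrames package. The table is a hypothetical miniature with the same leading columns as the real data:

```julia
using DataFrames

df = DataFrame(
    username = ["s1", "s2", "s3"],
    test1 = [3.0, 2.0, 1.0],
    test2 = [11.5, 3.0, 6.0],
    test3 = [13.5, 18.5, 7.0],
)

# Rows 1 to 2, columns 2 to 4, addressed by position as in a matrix.
block = df[1:2, 2:4]
```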
Not, Between, Cols, All
We can also select only the columns we need.
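A sketch of the four selectors named in the heading, assuming the DataFrames package (which exports `Not`, `Between`, `Cols` and `All`). The particular column choices are illustrative:

```julia
using DataFrames

df = DataFrame(rand(3, 5), :auto)            # columns x1 to x5

without_x2 = df[:, Not(:x2)]                 # everything except x2
first_three = df[:, Between(:x1, :x3)]       # contiguous range x1 to x3
picked = df[:, Cols(:x2, :x4)]               # an explicit set of columns
everything = df[:, All()]                    # all columns
```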
| Row | x1 | x2 | x3 | x4 | x5 |
|---|
| Float64 | Float64 | Float64 | Float64 | Float64 |
|---|
| 1 | 0.510751 | 0.126475 | 0.995901 | 0.940862 | 0.0399312 |
| 2 | 0.934257 | 0.0010669 | 0.22795 | 0.807367 | 0.176827 |
| 3 | 0.266104 | 0.482422 | 0.851943 | 0.549124 | 0.493393 |
| Row | x1 | x3 | x4 | x5 |
|---|
| Float64 | Float64 | Float64 | Float64 |
|---|
| 1 | 0.510751 | 0.995901 | 0.940862 | 0.0399312 |
| 2 | 0.934257 | 0.22795 | 0.807367 | 0.176827 |
| 3 | 0.266104 | 0.851943 | 0.549124 | 0.493393 |
InvertedIndex{Symbol}(:x2)
| Row | x1 | x4 | x5 |
|---|
| Float64 | Float64 | Float64 |
|---|
| 1 | 0.510751 | 0.940862 | 0.0399312 |
| 2 | 0.934257 | 0.807367 | 0.176827 |
| 3 | 0.266104 | 0.549124 | 0.493393 |
| Row | x1 | x2 | x3 |
|---|
| Float64 | Float64 | Float64 |
|---|
| 1 | 0.510751 | 0.126475 | 0.995901 |
| 2 | 0.934257 | 0.0010669 | 0.22795 |
| 3 | 0.266104 | 0.482422 | 0.851943 |
| Row | x4 | x5 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.940862 | 0.0399312 |
| 2 | 0.807367 | 0.176827 |
| 3 | 0.549124 | 0.493393 |
| Row | x2 | x4 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.126475 | 0.940862 |
| 2 | 0.0010669 | 0.807367 |
| 3 | 0.482422 | 0.549124 |
| Row | x1 | x2 | x3 | x4 | x5 |
|---|
| Float64 | Float64 | Float64 | Float64 | Float64 |
|---|
| 1 | 0.510751 | 0.126475 | 0.995901 | 0.940862 | 0.0399312 |
| 2 | 0.934257 | 0.0010669 | 0.22795 | 0.807367 | 0.176827 |
| 3 | 0.266104 | 0.482422 | 0.851943 | 0.549124 | 0.493393 |
The examples above use functionality from the DataFrames.jl package.
The "old matrix-style" approaches of course work too, e.g.:
| Row | x2 | x3 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.126475 | 0.995901 |
| 2 | 0.0010669 | 0.22795 |
| 3 | 0.482422 | 0.851943 |
Overwriting data
Values can be changed in several ways.
Consider a simple table:
3×3 Matrix{Any}:
"α" 1 0.5
"β" 2 1.5
"γ" 3 2.5
| Row | a | b | c |
|---|
| Any | Any | Any |
|---|
| 1 | α | 1 | 0.5 |
| 2 | β | 2 | 1.5 |
| 3 | γ | 3 | 2.5 |
Changing a single entry (Ξ is the capital ξ):
| Row | a | b | c |
|---|
| Any | Any | Any |
|---|
| 1 | Ξ | 1 | 0.5 |
| 2 | β | 2 | 1.5 |
| 3 | γ | 3 | 2.5 |
Overwriting all values in column c:
3-element Vector{String}:
"tau"
"pi"
"omega"
| Row | a | b | c |
|---|
| Any | Any | Any |
|---|
| 1 | Ξ | 1 | tau |
| 2 | β | 2 | pi |
| 3 | γ | 3 | omega |
Overwriting with a single value.
3-element view(::Vector{Any}, :) with eltype Any:
"ř"
"ř"
"ř"
| Row | a | b | c |
|---|
| Any | Any | Any |
|---|
| 1 | Ξ | 1 | ř |
| 2 | β | 2 | ř |
| 3 | γ | 3 | ř |
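The kinds of overwrite described above can be sketched in one place, assuming the DataFrames package. The original cells are not preserved, so the exact calls are a reconstruction consistent with the printed outputs:

```julia
using DataFrames

M = ["α" 1 0.5; "β" 2 1.5; "γ" 3 2.5]        # mixed entries give a Matrix{Any}
df = DataFrame(M, [:a, :b, :c])

df[1, :a] = "Ξ"                               # a single cell
df[:, :c] = ["tau", "pi", "omega"]            # whole column in place, eltype stays Any
df[:, :c] .= "ř"                              # one value broadcast over all rows
kept_type = eltype(df.c)                      # still Any after in-place writes

df[!, :c] = [true, false, missing]            # replaces the column, new element type
```

The `:` form writes into the existing column vector, while the `!` form swaps in a new one, which is why only the last assignment changes the column type.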
3-element Vector{Any}:
"ř"
"ř"
"ř"
3-element Vector{Any}:
"ř"
"pí"
"ř"
| Row | a | b | c |
|---|
| Any | Any | Any |
|---|
| 1 | Ξ | 1 | ř |
| 2 | β | 2 | ř |
| 3 | γ | 3 | ř |
3-element Vector{Any}:
"ř"
"ř"
"ř"
| Row | a | b | c |
|---|
| Any | Any | Any |
|---|
| 1 | Ξ | 1 | ř |
| 2 | β | 2 | ¿ |
| 3 | γ | 3 | ř |
| Row | a | b | c |
|---|
| Any | Any | Bool? |
|---|
| 1 | Ξ | 1 | true |
| 2 | β | 2 | false |
| 3 | γ | 3 | missing |
Overwriting the values in column a with a given value, but only in rows where the value in column b is odd (here we again index with a bit vector, or "mask", which we met earlier in the semester):
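A minimal sketch of this masked overwrite, assuming the DataFrames package and a hypothetical three-row table:

```julia
using DataFrames

df = DataFrame(a = ["Ξ", "β", "γ"], b = [1, 2, 3])

mask = isodd.(df.b)          # BitVector: true where b is odd
df[mask, :a] .= "λ"          # overwrite only the selected rows

even_rows = df[.!mask, :]    # the complementary rows as a new table
```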
3-element BitVector:
1
0
1
| Row | a | b | c |
|---|
| Any | Any | Bool? |
|---|
| 1 | λ | 1 | true |
| 2 | β | 2 | false |
| 3 | λ | 3 | missing |
| Row | a | b | c |
|---|
| Any | Any | Bool? |
|---|
| 1 | β | 2 | false |
Occasionally it is necessary to go through a table row by row.
In that case we can iterate over the row index.
For example, let us try to change the first column into a string consisting of the given symbol with a subscript given by the number in column b, in LaTeX notation (i.e. λ_1 etc.).
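Row-wise processing of this kind can be sketched as follows, assuming the DataFrames package. The table is illustrative, and `eachrow` is shown as an alternative to indexing:

```julia
using DataFrames

df = DataFrame(a = ["λ", "β", "λ"], b = [1, 2, 3])

# Iterate over the row index and rewrite one cell per row.
for i in 1:nrow(df)
    df[i, :a] = string(df[i, :a], "_", df[i, :b])
end

# eachrow yields DataFrameRow objects whose fields are read by name.
labels = [row.a for row in eachrow(df)]
```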
| Row | a | b | c |
|---|
| Any | Any | Bool? |
|---|
| 1 | λ_1 | 1 | true |
| 2 | β_2 | 2 | false |
| 3 | λ_3 | 3 | missing |
DataFrameRow
Row │ a b c
│ Any Any Bool?
─────┼─────────────────
1 │ λ_1 1 true
λ_1
DataFrameRow
Row │ a b c
│ Any Any Bool?
─────┼─────────────────
2 │ β_2 2 false
β_2
DataFrameRow
Row │ a b c
│ Any Any Bool?
─────┼───────────────────
3 │ λ_3 3 missing
λ_3
Displaying (show, first, last, view)
If a DataFrame is too large, its display is truncated (both rows and columns). This behavior can be overridden with the keyword arguments of the show function.
For our BI-MA1 example this is slightly overkill.
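A sketch of the relevant keyword arguments, assuming the DataFrames package and its `allrows` and `allcols` options of `show`. The random table is a stand-in for the real data:

```julia
using DataFrames

wide = DataFrame(rand(40, 30), :auto)

show(wide)                                     # default: may be cropped to the display
println()
show(wide, allcols = true)                     # every column, rows may still be cropped
println()
show(wide, allrows = true, allcols = true)     # the complete table
```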
659×17 DataFrame
634 rows omitted
| Row | username | test1 | test2 | test3 | second_chance | activity | tests_total | gitlab | assessment | exam_test | date | oral_exam | veto | points_total | mark | tutor | percentil |
|---|
| String15 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Bool | Int64? | String7? | Float64? | Bool? | Float64? | String1? | String7 | Int64 |
|---|
| 1 | student001 | 3.0 | 11.5 | 13.5 | missing | 5.0 | 28.0 | missing | true | 17 | 13:15 | 34.0 | missing | 67.0 | D | Irena | 58 |
| 2 | student002 | 2.0 | 3.0 | 18.5 | 19.5 | 3.0 | 25.0 | missing | true | 18 | 12:45 | 48.0 | missing | 76.0 | C | Irena | 43 |
| 3 | student003 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Ondra | 6 |
| 4 | student004 | 3.0 | missing | missing | missing | 4.0 | 3.0 | missing | false | missing | missing | missing | missing | missing | missing | Jan V | 13 |
| 5 | student005 | 3.0 | 18.0 | 17.0 | missing | 4.0 | 38.0 | missing | true | 19 | 17:15 | 56.0 | missing | 98.0 | A | Jarda | 93 |
| 6 | student006 | 3.0 | 10.5 | 15.0 | missing | missing | 28.5 | missing | true | 15 | 16:30 | 50.0 | missing | 78.5 | C | Jan S | 59 |
| 7 | student007 | 2.0 | 19.0 | 16.0 | missing | 2.0 | 37.0 | missing | true | 16 | 11:30 | 55.0 | missing | 94.0 | A | Jan V | 91 |
| 8 | student008 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 6 |
| 9 | student009 | 3.0 | 8.0 | 14.0 | missing | 3.0 | 25.0 | missing | true | 18 | 13:15 | 42.0 | missing | 70.0 | C | Irena | 46 |
| 10 | student010 | 1.0 | 6.0 | 7.0 | missing | missing | 14.0 | missing | false | missing | missing | missing | missing | missing | missing | Jan S | 27 |
| 11 | student011 | 3.0 | 14.5 | 19.0 | missing | 4.0 | 36.5 | missing | true | 17 | 15:00 | 57.0 | missing | 97.5 | A | Jakub | 89 |
| 12 | student012 | 3.0 | 18.0 | 15.0 | missing | missing | 36.0 | missing | true | 17 | 11:30 | 54.0 | missing | 90.0 | A | Jan V | 88 |
| 13 | student013 | 2.0 | missing | missing | missing | missing | 2.0 | missing | false | missing | missing | missing | missing | missing | missing | Irena | 11 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 648 | student648 | 1.0 | 13.5 | 14.5 | missing | missing | 29.0 | missing | true | 15 | 11:00 | 45.0 | missing | 74.0 | C | Jan V | 62 |
| 649 | student649 | 1.0 | missing | missing | missing | missing | 1.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 9 |
| 650 | student650 | 2.5 | 1.0 | 1.0 | missing | missing | 4.5 | missing | false | missing | missing | missing | missing | missing | missing | Ondra | 17 |
| 651 | student651 | 0.0 | 8.0 | 8.0 | missing | 1.0 | 16.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 30 |
| 652 | student652 | 2.0 | 5.5 | 12.5 | 12.5 | 1.0 | 20.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 35 |
| 653 | student653 | 3.0 | 8.5 | 17.5 | missing | missing | 29.0 | missing | true | 16 | 14:45 | 55.0 | missing | 84.0 | B | Jitka | 62 |
| 654 | student654 | 1.0 | 14.0 | 15.0 | missing | missing | 30.0 | missing | true | 16 | 14:30 | 50.0 | missing | 80.0 | B | Jan S | 65 |
| 655 | student655 | 1.5 | 5.0 | missing | missing | missing | 6.5 | missing | false | missing | missing | missing | missing | missing | missing | Irena | 19 |
| 656 | student656 | 2.5 | 5.5 | 17.5 | missing | missing | 25.5 | missing | true | 16 | 10:45 | 54.5 | missing | 80.0 | B | Ondra | 49 |
| 657 | student657 | 1.0 | 15.0 | 6.5 | 16.0 | 1.0 | 25.0 | missing | true | 15 | 14:15 | 42.0 | missing | 68.0 | D | Irena | 41 |
| 658 | student658 | 1.0 | 15.0 | 17.5 | missing | 0.5 | 33.5 | missing | true | 17 | 13:45 | 56.0 | missing | 90.0 | A | Jarda | 79 |
| 659 | student659 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Jakub | 6 |
659×17 DataFrame
Row │ username test1 test2 test3 second_chance activity ⋯
│ String15 Float64? Float64? Float64? Float64? Float64? ⋯
─────┼──────────────────────────────────────────────────────────────────────────
1 │ student001 3.0 11.5 13.5 missing 5.0 ⋯
2 │ student002 2.0 3.0 18.5 19.5 3.0
3 │ student003 missing missing missing missing missing
4 │ student004 3.0 missing missing missing 4.0
5 │ student005 3.0 18.0 17.0 missing 4.0 ⋯
6 │ student006 3.0 10.5 15.0 missing missing
7 │ student007 2.0 19.0 16.0 missing 2.0
8 │ student008 missing missing missing missing missing
9 │ student009 3.0 8.0 14.0 missing 3.0 ⋯
10 │ student010 1.0 6.0 7.0 missing missing
11 │ student011 3.0 14.5 19.0 missing 4.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
650 │ student650 2.5 1.0 1.0 missing missing
651 │ student651 0.0 8.0 8.0 missing 1.0 ⋯
652 │ student652 2.0 5.5 12.5 12.5 1.0
653 │ student653 3.0 8.5 17.5 missing missing
654 │ student654 1.0 14.0 15.0 missing missing
655 │ student655 1.5 5.0 missing missing missing ⋯
656 │ student656 2.5 5.5 17.5 missing missing
657 │ student657 1.0 15.0 6.5 16.0 1.0
658 │ student658 1.0 15.0 17.5 missing 0.5
659 │ student659 missing missing missing missing missing ⋯
11 columns and 638 rows omitted
659×17 DataFrame
Row │ username test1 test2 test3 second_chance activity tests_total gitlab assessment exam_test date oral_exam veto points_total mark tutor percentil
│ String15 Float64? Float64? Float64? Float64? Float64? Float64 Float64? Bool Int64? String7? Float64? Bool? Float64? String1? String7 Int64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ student001 3.0 11.5 13.5 missing 5.0 28.0 missing true 17 13:15 34.0 missing 67.0 D Irena 58
2 │ student002 2.0 3.0 18.5 19.5 3.0 25.0 missing true 18 12:45 48.0 missing 76.0 C Irena 43
3 │ student003 missing missing missing missing missing 0.0 missing false missing missing missing missing missing missing Ondra 6
4 │ student004 3.0 missing missing missing 4.0 3.0 missing false missing missing missing missing missing missing Jan V 13
5 │ student005 3.0 18.0 17.0 missing 4.0 38.0 missing true 19 17:15 56.0 missing 98.0 A Jarda 93
6 │ student006 3.0 10.5 15.0 missing missing 28.5 missing true 15 16:30 50.0 missing 78.5 C Jan S 59
7 │ student007 2.0 19.0 16.0 missing 2.0 37.0 missing true 16 11:30 55.0 missing 94.0 A Jan V 91
8 │ student008 missing missing missing missing missing 0.0 missing false missing missing missing missing missing missing Jitka 6
9 │ student009 3.0 8.0 14.0 missing 3.0 25.0 missing true 18 13:15 42.0 missing 70.0 C Irena 46
10 │ student010 1.0 6.0 7.0 missing missing 14.0 missing false missing missing missing missing missing missing Jan S 27
11 │ student011 3.0 14.5 19.0 missing 4.0 36.5 missing true 17 15:00 57.0 missing 97.5 A Jakub 89
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
650 │ student650 2.5 1.0 1.0 missing missing 4.5 missing false missing missing missing missing missing missing Ondra 17
651 │ student651 0.0 8.0 8.0 missing 1.0 16.0 missing false missing missing missing missing missing missing Jitka 30
652 │ student652 2.0 5.5 12.5 12.5 1.0 20.0 missing false missing missing missing missing missing missing Jitka 35
653 │ student653 3.0 8.5 17.5 missing missing 29.0 missing true 16 14:45 55.0 missing 84.0 B Jitka 62
654 │ student654 1.0 14.0 15.0 missing missing 30.0 missing true 16 14:30 50.0 missing 80.0 B Jan S 65
655 │ student655 1.5 5.0 missing missing missing 6.5 missing false missing missing missing missing missing missing Irena 19
656 │ student656 2.5 5.5 17.5 missing missing 25.5 missing true 16 10:45 54.5 missing 80.0 B Ondra 49
657 │ student657 1.0 15.0 6.5 16.0 1.0 25.0 missing true 15 14:15 42.0 missing 68.0 D Irena 41
658 │ student658 1.0 15.0 17.5 missing 0.5 33.5 missing true 17 13:45 56.0 missing 90.0 A Jarda 79
659 │ student659 missing missing missing missing missing 0.0 missing false missing missing missing missing missing missing Jakub 6
638 rows omitted
This would be even greater overkill.
Alternatively, we can inspect the beginning and the end of the table.
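A minimal sketch using `first` and `last` with a row count, assuming the DataFrames package and an illustrative table:

```julia
using DataFrames

df = DataFrame(id = 1:20, square = (1:20) .^ 2)

head = first(df, 3)    # the first three rows
tail = last(df, 3)     # the last three rows
```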
| Row | username | test1 | test2 | test3 | second_chance | activity | tests_total | gitlab | assessment | exam_test | date | oral_exam | veto | points_total | mark | tutor | percentil |
|---|
| String15 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Bool | Int64? | String7? | Float64? | Bool? | Float64? | String1? | String7 | Int64 |
|---|
| 1 | student001 | 3.0 | 11.5 | 13.5 | missing | 5.0 | 28.0 | missing | true | 17 | 13:15 | 34.0 | missing | 67.0 | D | Irena | 58 |
| 2 | student002 | 2.0 | 3.0 | 18.5 | 19.5 | 3.0 | 25.0 | missing | true | 18 | 12:45 | 48.0 | missing | 76.0 | C | Irena | 43 |
| 3 | student003 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Ondra | 6 |
| 4 | student004 | 3.0 | missing | missing | missing | 4.0 | 3.0 | missing | false | missing | missing | missing | missing | missing | missing | Jan V | 13 |
| 5 | student005 | 3.0 | 18.0 | 17.0 | missing | 4.0 | 38.0 | missing | true | 19 | 17:15 | 56.0 | missing | 98.0 | A | Jarda | 93 |
| 6 | student006 | 3.0 | 10.5 | 15.0 | missing | missing | 28.5 | missing | true | 15 | 16:30 | 50.0 | missing | 78.5 | C | Jan S | 59 |
| 7 | student007 | 2.0 | 19.0 | 16.0 | missing | 2.0 | 37.0 | missing | true | 16 | 11:30 | 55.0 | missing | 94.0 | A | Jan V | 91 |
| 8 | student008 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 6 |
| 9 | student009 | 3.0 | 8.0 | 14.0 | missing | 3.0 | 25.0 | missing | true | 18 | 13:15 | 42.0 | missing | 70.0 | C | Irena | 46 |
| 10 | student010 | 1.0 | 6.0 | 7.0 | missing | missing | 14.0 | missing | false | missing | missing | missing | missing | missing | missing | Jan S | 27 |
| Row | username | test1 | test2 | test3 | second_chance | activity | tests_total | gitlab | assessment | exam_test | date | oral_exam | veto | points_total | mark | tutor | percentil |
|---|
| String15 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Bool | Int64? | String7? | Float64? | Bool? | Float64? | String1? | String7 | Int64 |
|---|
| 1 | student650 | 2.5 | 1.0 | 1.0 | missing | missing | 4.5 | missing | false | missing | missing | missing | missing | missing | missing | Ondra | 17 |
| 2 | student651 | 0.0 | 8.0 | 8.0 | missing | 1.0 | 16.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 30 |
| 3 | student652 | 2.0 | 5.5 | 12.5 | 12.5 | 1.0 | 20.0 | missing | false | missing | missing | missing | missing | missing | missing | Jitka | 35 |
| 4 | student653 | 3.0 | 8.5 | 17.5 | missing | missing | 29.0 | missing | true | 16 | 14:45 | 55.0 | missing | 84.0 | B | Jitka | 62 |
| 5 | student654 | 1.0 | 14.0 | 15.0 | missing | missing | 30.0 | missing | true | 16 | 14:30 | 50.0 | missing | 80.0 | B | Jan S | 65 |
| 6 | student655 | 1.5 | 5.0 | missing | missing | missing | 6.5 | missing | false | missing | missing | missing | missing | missing | missing | Irena | 19 |
| 7 | student656 | 2.5 | 5.5 | 17.5 | missing | missing | 25.5 | missing | true | 16 | 10:45 | 54.5 | missing | 80.0 | B | Ondra | 49 |
| 8 | student657 | 1.0 | 15.0 | 6.5 | 16.0 | 1.0 | 25.0 | missing | true | 15 | 14:15 | 42.0 | missing | 68.0 | D | Irena | 41 |
| 9 | student658 | 1.0 | 15.0 | 17.5 | missing | 0.5 | 33.5 | missing | true | 17 | 13:45 | 56.0 | missing | 90.0 | A | Jarda | 79 |
| 10 | student659 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | Jakub | 6 |
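For quickly inspecting parts of a table there is the view function, or the @view macro, which merely exposes a part of the table.
It does not copy the underlying data (the result is a SubDataFrame referring to the parent table), so it should be more efficient.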
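That a view shares its data with the parent table can be sketched as follows, assuming the DataFrames package. The table and the selected ranges are illustrative:

```julia
using DataFrames

df = DataFrame(username = ["s$i" for i in 1:10], points = collect(1.0:10.0))

part = view(df, 3:5, [:username, :points])   # function form
same_rows = @view df[3:5, :]                 # macro form

# Writing through the view changes the parent table.
part[1, :points] = 100.0
```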
51×3 SubDataFrame
26 rows omitted
| Row | username | points_total | mark |
|---|
| String15 | Float64? | String1? |
|---|
| 1 | student100 | 91.5 | A |
| 2 | student101 | 80.5 | B |
| 3 | student102 | missing | missing |
| 4 | student103 | 73.0 | C |
| 5 | student104 | 75.5 | C |
| 6 | student105 | missing | missing |
| 7 | student106 | 81.5 | B |
| 8 | student107 | 76.0 | C |
| 9 | student108 | missing | missing |
| 10 | student109 | 76.5 | C |
| 11 | student110 | missing | missing |
| 12 | student111 | 57.0 | E |
| 13 | student112 | missing | missing |
| ⋮ | ⋮ | ⋮ | ⋮ |
| 40 | student139 | 89.0 | B |
| 41 | student140 | 74.0 | C |
| 42 | student141 | 90.0 | A |
| 43 | student142 | 64.5 | D |
| 44 | student143 | missing | missing |
| 45 | student144 | missing | missing |
| 46 | student145 | 78.0 | C |
| 47 | student146 | missing | missing |
| 48 | student147 | 80.5 | B |
| 49 | student148 | missing | F |
| 50 | student149 | missing | missing |
| 51 | student150 | 100.5 | A |
| Row | username | mark |
|---|
| String15 | String1? |
|---|
| 1 | student010 | missing |
| 2 | student011 | A |
| 3 | student012 | A |
| 4 | student013 | missing |
| 5 | student014 | missing |
| 6 | student015 | D |
Adding and removing columns
Columns are added with the insertcols! function:
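A sketch of three common forms of `insertcols!`, assuming a recent DataFrames release (the `after` keyword in particular). The column names and values are illustrative, not the original cells:

```julia
using DataFrames

df = DataFrame(x1 = [0.1, 0.2], x2 = [0.3, 0.4])

insertcols!(df, :a => [1, 2])                        # appended as the last column
insertcols!(df, 1, :b => "⊕")                        # position 1; the scalar is repeated
insertcols!(df, :x1, :c => [10, 20], after = true)   # placed right after x1
```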
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.0847751 | 0.0320252 |
| 2 | 0.592521 | 0.457126 |
| Row | x1 | x2 | a |
|---|
| Float64 | Float64 | Int64 |
|---|
| 1 | 0.0847751 | 0.0320252 | 1 |
| 2 | 0.592521 | 0.457126 | 2 |
| Row | a | x1 | x2 |
|---|
| Irration… | Float64 | Float64 |
|---|
| 1 | π | 0.0847751 | 0.0320252 |
| 2 | π | 0.592521 | 0.457126 |
| Row | a2 | a | x1 | x2 |
|---|
| Float64 | Irration… | Float64 | Float64 |
|---|
| 1 | 3.14159 | π | 0.0847751 | 0.0320252 |
| 2 | -3.14159 | π | 0.592521 | 0.457126 |
| Row | a2 | a | b | x1 | x2 |
|---|
| Float64 | Irration… | String | Float64 | Float64 |
|---|
| 1 | 3.14159 | π | ⊕ | 0.0847751 | 0.0320252 |
| 2 | -3.14159 | π | ⊕ | 0.592521 | 0.457126 |
| Row | a2 | a | b | x1 | c | x2 |
|---|
| Float64 | Irration… | String | Float64 | Int64 | Float64 |
|---|
| 1 | 3.14159 | π | ⊕ | 0.0847751 | 1 | 0.0320252 |
| 2 | -3.14159 | π | ⊕ | 0.592521 | 2 | 0.457126 |
Plain indexing can be used as well:
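Two indexing-based ways to add a column can be sketched as follows, assuming the DataFrames package. The names `d` and `e` are illustrative:

```julia
using DataFrames

df = DataFrame(x1 = [0.1, 0.2])

df[!, :d] .= 42          # new column filled with one value
df.e = ["u", "v"]        # new column from a vector of matching length
```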
| Row | a2 | a | b | x1 | c | x2 | d |
|---|
| Float64 | Irration… | String | Float64 | Int64 | Float64 | Int64 |
|---|
| 1 | 3.14159 | π | ⊕ | 0.0847751 | 1 | 0.0320252 | 42 |
| 2 | -3.14159 | π | ⊕ | 0.592521 | 2 | 0.457126 | 42 |
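A short sketch of adding columns by indexing, again on a hypothetical table with fixed values. The column name d mirrors the output above, while e is an illustrative addition:

```julia
using DataFrames

df = DataFrame(x1 = [0.1, 0.2], x2 = [0.3, 0.4])

# Broadcasting assignment creates the column and fills it with a scalar.
df[!, :d] .= 42

# Assigning a whole vector through property syntax also creates a column.
df.e = ["p", "q"]
```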
There is no "dropcolumns!" function; a column is removed by selecting the remaining ones, either through indexing or with select!(df, Not(:b)), which works in place.
| Row | a2 | a | x1 | c | x2 | d |
|---|
| Float64 | Irration… | Float64 | Int64 | Float64 | Int64 |
|---|
| 1 | 3.14159 | π | 0.0847751 | 1 | 0.0320252 | 42 |
| 2 | -3.14159 | π | 0.592521 | 2 | 0.457126 | 42 |
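Both ways of dropping a column can be sketched as follows, assuming a hypothetical three-column table. Not is the inverted selector exported by DataFrames.jl:

```julia
using DataFrames

df = DataFrame(a = [1, 2], b = ["⊕", "⊕"], c = [0.5, 1.5])

# Indexing with Not builds a new table without column b; df is unchanged.
without_b = df[:, Not(:b)]

# select! keeps only the listed columns and modifies df in place.
select!(df, Not(:b))
```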
Adding and removing rows
Let us create a test table again:
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
A row can again be added in several ways. The most natural one is probably push!:
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| 4 | 1.0 | 2.0 |
| Row | x1 | x2 |
|---|
| Float64 | Float64 |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| 4 | 1.0 | 2.0 |
| 5 | 0.5 | 2.3 |
┌ Error: Error adding value to column :x1. Maybe you forgot passing `promote=true`?
└ @ DataFrames ~/.julia/packages/DataFrames/b4w9K/src/dataframe/insertion.jl:810
MethodError: Cannot `convert` an object of type String to an object of type Float64
The function `convert` exists, but no method is defined for this combination of argument types.
Closest candidates are:
convert(::Type{T}, ::T) where T<:Number
@ Base number.jl:6
convert(::Type{T}, ::Number) where T<:Number
@ Base number.jl:7
convert(::Type{T}, ::T) where T
@ Base Base_compiler.jl:133
...
Stacktrace:
[1] push!(a::Vector{Float64}, item::String)
@ Base ./array.jl:1285
[2] _row_inserter!(df::DataFrame, loc::Int64, row::Vector{String}, mode::Val{:push}, promote::Bool)
@ DataFrames ~/.julia/packages/DataFrames/b4w9K/src/dataframe/insertion.jl:776
[3] #push!#383
@ ~/.julia/packages/DataFrames/b4w9K/src/dataframe/insertion.jl:545 [inlined]
[4] push!(df::DataFrame, row::Vector{String})
@ DataFrames ~/.julia/packages/DataFrames/b4w9K/src/dataframe/insertion.jl:538
[5] top-level scope
@ In[54]:1
[6] eval(m::Module, e::Any)
@ Core ./boot.jl:489
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| 4 | 1.0 | 2.0 |
| 5 | 0.5 | 2.3 |
| 6 | a | b |
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| 4 | 1.0 | 2.0 |
| 5 | 0.5 | 2.3 |
| 6 | a | b |
| 7 | 0.3 | 0.5 |
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| 4 | 1.0 | 2.0 |
| 5 | 0.5 | 2.3 |
| 6 | a | b |
| 7 | 0.3 | 0.5 |
| 8 | 5 | 7 |
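The sequence above, including the failed insertion and the effect of promote = true, can be sketched like this. The table and values are hypothetical; only the behaviour of push! is taken from the outputs above:

```julia
using DataFrames

df = DataFrame(x1 = [0.1], x2 = [0.2])

push!(df, [1.0, 2.0])                     # vector, matched by position
push!(df, (1.0, 2.0))                     # tuple, matched by position
push!(df, (x1 = 0.5, x2 = 2.3))           # NamedTuple, matched by name
push!(df, Dict(:x1 => 0.3, :x2 => 0.5))   # dictionary, matched by name

# Strings do not fit into Float64 columns, so this push! throws
# (and logs an error) while leaving the table as it was.
rows_before = nrow(df)
failed = try
    push!(df, ["a", "b"])
    false
catch
    true
end

# With promote = true the element types of the columns are widened,
# here all the way to Any.
push!(df, ["a", "b"]; promote = true)
```

Widening to Any keeps the table usable but gives up type stability, so it is usually worth checking why the incoming row does not match before reaching for promote = true.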
With append! we can join several tables together "vertically":
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| 4 | 1.0 | 2.0 |
| 5 | 0.5 | 2.3 |
| 6 | a | b |
| 7 | 0.3 | 0.5 |
| 8 | 5 | 7 |
| 9 | 0.840783 | 0.0684555 |
| 10 | 0.40825 | 0.53635 |
| 11 | 0.400456 | 0.684138 |
| 12 | 0.162501 | 0.0953885 |
| 13 | 0.0929935 | 0.749572 |
| 14 | 0.779738 | 0.450801 |
| 15 | 0.855484 | 0.167365 |
| 16 | 0.33831 | 0.0827383 |
| 17 | 0.701541 | 0.567536 |
| 18 | 0.0982674 | 0.443836 |
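A minimal sketch of vertical concatenation on two hypothetical tables. vcat is shown next to append! as the non-mutating alternative:

```julia
using DataFrames

df1 = DataFrame(x1 = [1.0, 2.0], x2 = [3.0, 4.0])
df2 = DataFrame(x1 = [5.0], x2 = [6.0])

# append! adds the rows of df2 to df1 in place.
append!(df1, df2)

# vcat returns a new table and leaves its arguments untouched.
stacked = vcat(df1, df2)
```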
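Rows can be deleted explicitly with delete! (in current DataFrames.jl releases the function is called deleteat!, and delete! is kept only as a deprecated name):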
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.425608 | 0.338276 |
| 2 | 0.450723 | 0.315649 |
| 3 | 1.0 | 2.0 |
| 4 | 0.5 | 2.3 |
| 5 | a | b |
| 6 | 0.3 | 0.5 |
| 7 | 5 | 7 |
| 8 | 0.840783 | 0.0684555 |
| 9 | 0.40825 | 0.53635 |
| 10 | 0.400456 | 0.684138 |
| 11 | 0.162501 | 0.0953885 |
| 12 | 0.0929935 | 0.749572 |
| 13 | 0.779738 | 0.450801 |
| 14 | 0.855484 | 0.167365 |
| 15 | 0.33831 | 0.0827383 |
| 16 | 0.701541 | 0.567536 |
| 17 | 0.0982674 | 0.443836 |
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.5 | 2.3 |
| 2 | a | b |
| 3 | 0.3 | 0.5 |
| 4 | 5 | 7 |
| 5 | 0.840783 | 0.0684555 |
| 6 | 0.40825 | 0.53635 |
| 7 | 0.400456 | 0.684138 |
| 8 | 0.162501 | 0.0953885 |
| 9 | 0.0929935 | 0.749572 |
| 10 | 0.779738 | 0.450801 |
| 11 | 0.855484 | 0.167365 |
| 12 | 0.33831 | 0.0827383 |
| 13 | 0.701541 | 0.567536 |
| 14 | 0.0982674 | 0.443836 |
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.5 | 2.3 |
| 2 | 0.3 | 0.5 |
| 3 | 5 | 7 |
| 4 | 0.840783 | 0.0684555 |
| 5 | 0.40825 | 0.53635 |
| 6 | 0.400456 | 0.684138 |
| 7 | 0.162501 | 0.0953885 |
| 8 | 0.0929935 | 0.749572 |
| 9 | 0.779738 | 0.450801 |
| 10 | 0.855484 | 0.167365 |
| 11 | 0.33831 | 0.0827383 |
| 12 | 0.701541 | 0.567536 |
| 13 | 0.0982674 | 0.443836 |
13-element BitVector:
0
0
1
1
0
0
0
0
1
1
0
1
0
| Row | x1 | x2 |
|---|
| Any | Any |
|---|
| 1 | 0.5 | 2.3 |
| 2 | 0.3 | 0.5 |
| 3 | 0.40825 | 0.53635 |
| 4 | 0.400456 | 0.684138 |
| 5 | 0.162501 | 0.0953885 |
| 6 | 0.0929935 | 0.749572 |
| 7 | 0.33831 | 0.0827383 |
| 8 | 0.0982674 | 0.443836 |
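The steps above (a single row, a range of rows, and finally a Boolean mask) might be sketched as follows. The data and the threshold 0.7 are hypothetical, and deleteat! is used as the current name of the function:

```julia
using DataFrames

df = DataFrame(x1 = [0.1, 0.9, 0.3, 0.8, 0.5, 0.2], x2 = [1, 2, 3, 4, 5, 6])

deleteat!(df, 1)       # remove a single row by its index
deleteat!(df, 1:2)     # remove a range of rows

# A Boolean mask marks the rows to delete, here those with x1 above 0.7.
mask = df.x1 .> 0.7
deleteat!(df, mask)
```

Row indices shift after every deletion, so each call refers to the table as it looks at that moment.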
That is far from everything
Above we summarized what are probably the most useful functions and techniques.
The capabilities of DataFrames.jl go well beyond that, however.
The curious reader is encouraged to skim through the documentation.
Here we only point out that very sophisticated transformations of tabular data can also be performed with the following functions:
- combine
- select/select!
- transform/transform!
A detailed treatment of them is beyond the scope of this course.
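For orientation only, a very small sketch of how the three functions differ, using a hypothetical table. It is not a full treatment of their source => function => target mini-language:

```julia
using DataFrames

df = DataFrame(group = ["a", "a", "b"], value = [1.0, 3.0, 10.0])

# select keeps only the columns that are requested or computed.
doubled = select(df, :value => ByRow(v -> 2v) => :doubled)

# transform keeps all original columns and adds the computed ones.
extended = transform(df, :value => ByRow(v -> 2v) => :doubled)

# combine aggregates; together with groupby it does so per group.
totals = combine(groupby(df, :group), :value => sum => :total)
```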
2. Exercise: analysis of BI-MA1 in B212, B222, B232
Let us take a closer look at the data from the BI-MA1 course.
First we reload the data from a CSV file, a raw export from Grades in which the students' usernames have been anonymized.
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|
| Symbol | Union… | Any | Union… | Any | Int64 | Type |
|---|
| 1 | username | | student001 | | student528 | 0 | String15 |
| 2 | test1 | 8.45199 | 0.0 | 8.0 | 20.0 | 101 | Union{Missing, Float64} |
| 3 | test2 | 13.3188 | 0.0 | 14.0 | 20.0 | 219 | Union{Missing, Float64} |
| 4 | second_chance | 15.0588 | 0.0 | 15.0 | 23.5 | 460 | Union{Missing, Float64} |
| 5 | activity | 1.95629 | 0.0 | 2.0 | 5.0 | 242 | Union{Missing, Float64} |
| 6 | tests_total | 14.928 | 0.0 | 15.5 | 39.5 | 0 | Float64 |
| 7 | gitlab | 1.30769 | 0.0 | 1.0 | 7.0 | 489 | Union{Missing, Float64} |
| 8 | assessment | 0.395833 | false | 0.0 | true | 0 | Bool |
| 9 | exam_test | 16.7081 | 0 | 17.0 | 20 | 319 | Union{Missing, Int64} |
| 10 | date | | --- | | 16:50 | 319 | Union{Missing, String7} |
| 11 | oral_exam | 48.3026 | 28.0 | 50.0 | 60.0 | 338 | Union{Missing, Float64} |
| 12 | veto | 1.0 | true | 1.0 | true | 525 | Union{Missing, Bool} |
| 13 | points_total | 79.6217 | 55.0 | 80.0 | 103.5 | 339 | Union{Missing, Float64} |
| 14 | mark | | A | | F | 319 | Union{Missing, String1} |
| 15 | tutor | | Honza | | Tomáš | 8 | Union{Missing, String7} |
| 16 | percentil | 53.1136 | 23 | 50.0 | 100 | 0 | Int64 |
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|
| Symbol | Union… | Any | Union… | Any | Int64 | Type |
|---|
| 1 | username | | student0001 | | student0760 | 0 | String15 |
| 2 | test1 | 1.69818 | 0.0 | 1.5 | 3.0 | 99 | Union{Missing, Float64} |
| 3 | test2 | 10.6672 | 0.0 | 11.0 | 20.0 | 132 | Union{Missing, Float64} |
| 4 | test3 | 12.6893 | 0.0 | 13.5 | 20.0 | 245 | Union{Missing, Float64} |
| 5 | second_chance | 14.5402 | 5.0 | 15.0 | 22.0 | 673 | Union{Missing, Float64} |
| 6 | activity | 2.27454 | -6.0 | 2.0 | 5.0 | 383 | Union{Missing, Float64} |
| 7 | tests_total | 19.1148 | 0.0 | 24.5 | 43.0 | 2 | Union{Missing, Float64} |
| 8 | gitlab | 2.23684 | 1 | 1.0 | 12 | 722 | Union{Missing, Int64} |
| 9 | assessment | 0.497361 | false | 0.0 | true | 2 | Union{Missing, Bool} |
| 10 | exam_test | 16.4377 | -1 | 17.0 | 20 | 383 | Union{Missing, Int64} |
| 11 | date | | --- | | CT 14:30 | 383 | Union{Missing, String15} |
| 12 | oral_exam | 46.8988 | 20.0 | 48.0 | 57.0 | 419 | Union{Missing, Float64} |
| 13 | veto | 1.0 | true | 1.0 | true | 757 | Union{Missing, Bool} |
| 14 | points_total | 80.3235 | 54.5 | 80.5 | 105.0 | 420 | Union{Missing, Float64} |
| 15 | mark | | A | | F | 383 | Union{Missing, String1} |
| 16 | tutor | | Irena | | Petr | 2 | Union{Missing, String7} |
| 17 | percentil | 50.9513 | 12 | 50.0 | 100 | 0 | Int64 |
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|
| Symbol | Union… | Any | Union… | Any | Int64 | Type |
|---|
| 1 | username | | student001 | | student659 | 0 | String15 |
| 2 | test1 | 2.02801 | 0.0 | 2.0 | 3.0 | 70 | Union{Missing, Float64} |
| 3 | test2 | 11.5804 | 0.0 | 12.0 | 20.0 | 68 | Union{Missing, Float64} |
| 4 | test3 | 13.5825 | 0.0 | 14.5 | 20.0 | 150 | Union{Missing, Float64} |
| 5 | second_chance | 12.6964 | 2.5 | 14.0 | 22.5 | 589 | Union{Missing, Float64} |
| 6 | activity | 2.38564 | 0.0 | 2.0 | 5.0 | 334 | Union{Missing, Float64} |
| 7 | tests_total | 22.8437 | 0.0 | 26.5 | 43.0 | 0 | Float64 |
| 8 | gitlab | 0.814815 | 0.5 | 0.5 | 4.0 | 632 | Union{Missing, Float64} |
| 9 | assessment | 0.616085 | false | 1.0 | true | 0 | Bool |
| 10 | exam_test | 16.9113 | -1 | 17.0 | 20 | 253 | Union{Missing, Int64} |
| 11 | date | | --- | | 17:30 | 254 | Union{Missing, String7} |
| 12 | oral_exam | 47.7257 | 0.0 | 49.0 | 57.0 | 278 | Union{Missing, Float64} |
| 13 | veto | 1.0 | true | 1.0 | true | 657 | Union{Missing, Bool} |
| 14 | points_total | 81.5019 | 55.0 | 81.5 | 105.0 | 279 | Union{Missing, Float64} |
| 15 | mark | | A | | F | 253 | Union{Missing, String1} |
| 16 | tutor | | Irena | | Ondra | 0 | String7 |
| 17 | percentil | 50.3612 | 0 | 50.0 | 100 | 0 | Int64 |
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|
| Symbol | Union… | Any | Union… | Any | Int64 | Type |
|---|
| 1 | username | | student000 | | student569 | 0 | String15 |
| 2 | test1 | 2.04199 | 0.0 | 2.5 | 3.0 | 58 | Union{Missing, Float64} |
| 3 | test2 | 11.5404 | 0.0 | 12.5 | 20.0 | 62 | Union{Missing, Float64} |
| 4 | test3 | 13.4211 | 0.0 | 13.75 | 20.0 | 120 | Union{Missing, Float64} |
| 5 | second_chance | 13.4273 | 5.0 | 14.0 | 21.5 | 515 | Union{Missing, Float64} |
| 6 | activity | 2.61856 | 0.0 | 2.5 | 5.0 | 242 | Union{Missing, Float64} |
| 7 | tests_total | 22.8987 | 0.0 | 27.0 | 43.0 | 0 | Float64 |
| 8 | gitlab | 0.977273 | 0.5 | 1.0 | 2.5 | 548 | Union{Missing, Float64} |
| 9 | assessment | 0.640351 | false | 1.0 | true | 0 | Bool |
| 10 | elsa | | + 25 % | | úraz ruky + čas (semestr?) | 557 | Union{Missing, String31} |
| 11 | exam_test | 15.9038 | -1 | 17.0 | 20 | 206 | Union{Missing, Int64} |
| 12 | date | | --- | | 2025-06-30 | 206 | Union{Missing, String15} |
| 13 | oral_exam | 46.8938 | 15.0 | 49.0 | 59.0 | 264 | Union{Missing, Float64} |
| 14 | veto | 1.0 | true | 1.0 | true | 565 | Union{Missing, Bool} |
| 15 | points_total | 80.6416 | 58.0 | 80.5 | 104.0 | 266 | Union{Missing, Float64} |
| 16 | mark | | A | | F | 206 | Union{Missing, String1} |
| 17 | tutor | | Eva | | Ondra | 37 | Union{Missing, String7} |
| 18 | percentil | 50.5895 | 8 | 52.0 | 100 | 0 | Int64 |
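Summaries like the ones above are what describe returns for a loaded table. The sketch below assumes CSV.jl as the reader (the String15 column types suggest it, but this is an assumption) and uses a tiny inline sample in place of the real export, whose file name is not shown here:

```julia
using CSV, DataFrames

# Hypothetical miniature export; empty fields are read as missing.
csv = """
username,test1,assessment,mark
student001,3.0,true,A
student002,,false,
student003,1.5,true,F
"""

df = CSV.read(IOBuffer(csv), DataFrame)

# describe accepts a list of statistics; the default set is larger.
overview = describe(df, :nmissing, :eltype)
```

With a real file the call would take a path instead of an IOBuffer, for example CSV.read("grades.csv", DataFrame), where the file name is again only a placeholder.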
2.1 Basic facts
First explore the available columns and try to find out basic facts such as
- How many students were enrolled in the course?
- How many students obtained the assessment and completed the course?
- How many students were in which year of study? (This makes no sense for the first run of BI-MA1.)
- ...
How many students have the assessment?
758-element Vector{Union{Missing, Bool}}:
1
1
0
0
0
0
1
1
1
1
0
1
0
⋮
1
0
0
1
1
1
1
0
1
1
0
1
Successfully completed the course:
Number of students who obtained the assessment but did not pass the exam.
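The three counts can be sketched on a hypothetical five-row sample as follows. The column names assessment and mark come from the tables above. Reading "did not pass the exam" as "mark is F or missing" is an assumption made here for illustration:

```julia
using DataFrames

df = DataFrame(
    assessment = [true, true, false, missing, true],
    mark = ["A", "F", missing, missing, missing],
)

# skipmissing drops the missing entries before counting the true values.
with_assessment = count(skipmissing(df.assessment))

# Completed the course: a mark is present and it is not F.
completed = count(m -> !ismissing(m) && m != "F", df.mark)

# Assessment obtained, but the mark is F or absent (assumed definition).
has_assessment = coalesce.(df.assessment, false)
no_pass = coalesce.(df.mark, "F") .== "F"
assessment_but_failed = count(has_assessment .& no_pass)
```

coalesce replaces each missing value with the given default, which avoids three-valued logic in the element-wise comparison.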
2.2 Assessment tests, quizzes and the semester
Next, let us look more closely at how students obtained the assessment.
- Explore the results of the individual assessment tests and create histograms of the results.
- How many students "narrowly" missed the assessment?
- How many students "dropped out" already in the first half of the semester?
- ...
Histogram of the first assessment test.
Beware, this does not work with a bare skipmissing.
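The reason is that skipmissing returns a lazy wrapper, not a vector. A minimal sketch with hypothetical scores shows how collect turns it into a plain Vector{Float64}, which any plotting function should accept:

```julia
# Hypothetical scores with two missing entries.
test1 = [3.0, 2.0, missing, 3.0, 1.0, missing]

lazy = skipmissing(test1)   # lazy iterator over the non-missing values
scores = collect(lazy)      # materialized Vector{Float64}
```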
skipmissing(Union{Missing, Float64}[3.0, 2.0, missing, 3.0, 3.0, 3.0, 2.0, missing, 3.0, 1.0 … 2.5, 0.0, 2.0, 3.0, 1.0, 1.5, 2.5, 1.0, 1.0, missing])
589-element Vector{Float64}:
3.0
2.0
3.0
3.0
3.0
2.0
3.0
1.0
3.0
3.0
2.0
2.5
2.5
⋮
3.0
1.0
1.0
2.5
0.0
2.0
3.0
1.0
1.5
2.5
1.0
1.0
The last run can be compared with the second-to-last one!
Histogram of the second assessment test.
Number of students who took the tests in the last run.
Number of students who obtained the assessment without the retake test.
Number of students with at least 20 points from the assessment tests.
Histogram of the retake assessment test.
Histogram of the total number of points from the semester.
Assessment success rate by tutor.
GroupedDataFrame with 10 groups based on key: tutor
First Group (46 rows): tutor = "Irena"
21 rows omitted
| Row | username | test1 | test2 | test3 | second_chance | activity | tests_total | gitlab | assessment | elsa | exam_test | date | oral_exam | veto | points_total | mark | tutor | percentil |
|---|
| String15 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Bool | String31? | Int64? | String15? | Float64? | Bool? | Float64? | String1? | String7? | Int64 |
|---|
| 1 | student000 | 0.0 | 7.5 | 10.0 | missing | missing | 17.5 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 29 |
| 2 | student004 | 3.0 | 5.0 | missing | missing | missing | 8.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 20 |
| 3 | student009 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 8 |
| 4 | student016 | 2.0 | 7.0 | 6.5 | missing | 2.0 | 15.5 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 27 |
| 5 | student038 | 2.0 | 4.0 | 0.0 | missing | missing | 6.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 18 |
| 6 | student052 | 3.0 | 17.5 | 12.0 | missing | 5.0 | 32.5 | missing | true | missing | 17 | 14:00 | 49.0 | missing | 86.5 | B | Irena | 76 |
| 7 | student060 | 3.0 | 13.5 | 10.5 | missing | missing | 27.0 | missing | true | missing | 17 | 13:30 | 46.0 | missing | 73.0 | C | Irena | 52 |
| 8 | student077 | 3.0 | 2.5 | 5.5 | missing | missing | 11.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 23 |
| 9 | student085 | 3.0 | 6.0 | 20.0 | missing | 5.0 | 29.0 | missing | true | missing | 10 | --- | missing | missing | missing | F | Irena | 61 |
| 10 | student103 | 3.0 | 15.0 | 12.5 | missing | missing | 30.5 | missing | true | missing | 19 | 14:00 | 57.0 | missing | 87.5 | B | Irena | 66 |
| 11 | student111 | 3.0 | 18.5 | 13.0 | missing | 4.0 | 34.5 | missing | true | missing | 13 | --- | missing | missing | missing | F | Irena | 84 |
| 12 | student115 | 3.0 | 11.5 | 13.5 | missing | 5.0 | 28.0 | missing | true | missing | 15 | 11:00 | 49.0 | missing | 82.0 | B | Irena | 57 |
| 13 | student117 | 3.0 | 11.0 | 12.5 | missing | 5.0 | 26.5 | missing | true | missing | 15 | 13:15 | 53.0 | missing | 84.5 | B | Irena | 49 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 35 | student424 | 3.0 | 18.0 | 11.5 | missing | 5.0 | 32.5 | missing | true | missing | 19 | 12:30 | 56.0 | missing | 93.5 | A | Irena | 76 |
| 36 | student429 | 3.0 | 15.0 | 13.0 | missing | 5.0 | 31.0 | missing | true | missing | 17 | 12:15 | 37.0 | missing | 73.0 | C | Irena | 69 |
| 37 | student433 | 3.0 | 17.0 | 13.5 | missing | 5.0 | 33.5 | missing | true | missing | 15 | 13:30 | 52.0 | missing | 90.5 | A | Irena | 80 |
| 38 | student438 | 3.0 | 17.0 | 11.0 | missing | 5.0 | 31.0 | missing | true | missing | 20 | 12:30 | 39.0 | missing | 75.0 | C | Irena | 69 |
| 39 | student447 | 2.0 | 6.0 | 10.0 | missing | missing | 18.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 30 |
| 40 | student471 | 3.0 | 9.0 | 9.5 | 18.0 | 5.0 | 25.0 | missing | true | missing | 16 | 12:15 | 56.0 | missing | 86.0 | B | Irena | 36 |
| 41 | student507 | 2.0 | 10.5 | 9.5 | missing | 4.0 | 22.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Irena | 37 |
| 42 | student509 | 3.0 | 15.0 | 16.0 | missing | 5.0 | 34.0 | missing | true | missing | -1 | --- | missing | missing | missing | F | Irena | 82 |
| 43 | student531 | 3.0 | 16.5 | 11.5 | missing | 5.0 | 31.0 | missing | true | missing | 15 | 10:45 | 51.0 | missing | 87.0 | B | Irena | 69 |
| 44 | student535 | 1.5 | 13.5 | 10.5 | missing | 0.0 | 25.5 | 1.0 | true | + 25 % | 20 | 14:15 | 38.0 | missing | 64.5 | D | Irena | 46 |
| 45 | student546 | 3.0 | 8.0 | 19.5 | missing | 5.0 | 30.5 | missing | true | missing | 19 | 14:00 | 56.0 | missing | 91.5 | A | Irena | 66 |
| 46 | student556 | 3.0 | 16.5 | 16.0 | missing | 2.0 | 35.5 | missing | true | missing | 10 | --- | missing | missing | missing | F | Irena | 89 |
⋮
Last Group (29 rows): tutor = "Jitka"
4 rows omitted
| Row | username | test1 | test2 | test3 | second_chance | activity | tests_total | gitlab | assessment | elsa | exam_test | date | oral_exam | veto | points_total | mark | tutor | percentil |
|---|
| String15 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Bool | String31? | Int64? | String15? | Float64? | Bool? | Float64? | String1? | String7? | Int64 |
|---|
| 1 | student037 | 3.0 | 7.5 | 11.0 | 21.5 | 2.0 | 25.0 | missing | true | missing | 16 | 12:15 | 42.0 | missing | 69.0 | D | Jitka | 36 |
| 2 | student050 | 2.0 | 17.5 | 19.0 | missing | 4.0 | 38.5 | missing | true | missing | 20 | 12:45 | 53.0 | missing | 95.5 | A | Jitka | 95 |
| 3 | student054 | 3.0 | 13.5 | 19.0 | missing | missing | 35.5 | missing | true | missing | 18 | 12:30 | 53.0 | missing | 88.5 | B | Jitka | 89 |
| 4 | student075 | 1.5 | 12.5 | 20.0 | missing | 1.0 | 34.0 | missing | true | missing | 15 | 12:30 | 46.0 | missing | 81.0 | B | Jitka | 82 |
| 5 | student121 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 8 |
| 6 | student152 | 1.5 | 7.5 | 8.5 | missing | 1.0 | 17.5 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 29 |
| 7 | student169 | 2.5 | 9.0 | 14.0 | missing | 3.0 | 25.5 | missing | true | missing | 15 | 13:00 | 15.0 | missing | missing | F | Jitka | 46 |
| 8 | student177 | 3.0 | 13.5 | 17.0 | missing | 2.0 | 33.5 | missing | true | missing | 16 | 12:30 | 54.5 | missing | 90.0 | A | Jitka | 80 |
| 9 | student187 | 3.0 | 15.0 | 13.5 | missing | 1.0 | 31.5 | missing | true | missing | 17 | 10:30 | 53.0 | missing | 85.5 | B | Jitka | 71 |
| 10 | student189 | 3.0 | 10.5 | 12.0 | missing | 1.0 | 25.5 | missing | true | missing | 19 | 10:15 | 44.0 | missing | 70.5 | C | Jitka | 46 |
| 11 | student241 | 2.0 | 19.0 | 16.5 | missing | 4.0 | 37.5 | missing | true | missing | 17 | 10:30 | 53.0 | missing | 94.5 | A | Jitka | 93 |
| 12 | student266 | 2.5 | 6.5 | 15.5 | 9.5 | 1.5 | 24.5 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 41 |
| 13 | student305 | 2.0 | 16.0 | 9.5 | missing | 1.5 | 27.5 | missing | true | missing | 16 | 12:15 | 47.0 | missing | 76.0 | C | Jitka | 54 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 18 | student385 | 1.0 | 9.0 | 9.0 | missing | 2.0 | 19.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 31 |
| 19 | student404 | 1.5 | 6.5 | 5.5 | missing | missing | 13.5 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 26 |
| 20 | student409 | 2.0 | 16.5 | 16.0 | missing | missing | 34.5 | missing | true | missing | 15 | 12:45 | 28.5 | missing | 63.0 | D | Jitka | 84 |
| 21 | student474 | 2.5 | 18.5 | 15.0 | missing | 4.5 | 36.0 | missing | true | missing | -1 | --- | missing | missing | missing | F | Jitka | 90 |
| 22 | student478 | 2.0 | 14.0 | 12.0 | missing | missing | 28.0 | missing | true | missing | 20 | 12:30 | 40.0 | missing | 68.0 | D | Jitka | 57 |
| 23 | student510 | 2.0 | 17.5 | 18.0 | missing | 3.0 | 37.5 | missing | true | missing | 8 | --- | missing | missing | missing | F | Jitka | 93 |
| 24 | student519 | 3.0 | 12.5 | 13.5 | missing | 5.0 | 29.0 | missing | true | missing | 18 | 11:00 | 30.0 | missing | 64.0 | D | Jitka | 61 |
| 25 | student533 | 0.5 | 0.5 | missing | missing | missing | 1.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 12 |
| 26 | student543 | 3.0 | 11.5 | 13.5 | missing | 1.0 | 28.0 | missing | true | missing | 20 | 12:00 | 53.0 | missing | 82.0 | B | Jitka | 57 |
| 27 | student548 | 1.5 | 16.0 | 14.0 | missing | 2.5 | 31.5 | missing | true | missing | 17 | 10:45 | 45.0 | missing | 79.0 | C | Jitka | 71 |
| 28 | student551 | 3.0 | 11.5 | 8.5 | 7.0 | 0.5 | 23.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 39 |
| 29 | student560 | missing | missing | missing | missing | missing | 0.0 | missing | false | missing | missing | missing | missing | missing | missing | missing | Jitka | 8 |
| Row | tutor | studenti | zápočty | psali 1. test | průměrná aktivita | propustnost |
|---|
| String7? | Int64 | Int64 | Int64 | Float64 | Float64 |
|---|
| 1 | Irena | 46 | 32 | 44 | 3.1087 | 0.695652 |
| 2 | Ondra | 46 | 18 | 41 | 2.58453 | 0.391304 |
| 3 | missing | 37 | 8 | 18 | 0.460962 | 0.216216 |
| 4 | Jan S | 48 | 35 | 47 | 0.0625 | 0.729167 |
| 5 | Jakub | 95 | 54 | 83 | 0.847368 | 0.568421 |
| 6 | Jan V | 91 | 72 | 85 | 0.960318 | 0.791209 |
| 7 | Jarda | 101 | 66 | 96 | 2.55996 | 0.653465 |
| 8 | Eva | 30 | 15 | 24 | 1.66667 | 0.5 |
| 9 | Luděk | 47 | 45 | 47 | 1.07447 | 0.957447 |
| 10 | Jitka | 29 | 20 | 27 | 1.72414 | 0.689655 |
| Row | tutor | studenti | zápočty | psali 1. test | průměrná aktivita | propustnost |
|---|
| String7 | Int64 | Int64 | Int64 | Float64 | Float64 |
|---|
| 1 | Irena | 99 | 66 | 91 | 2.33333 | 0.666667 |
| 2 | Ondra | 94 | 51 | 81 | 0.205676 | 0.542553 |
| 3 | Jan V | 144 | 77 | 126 | 0.673611 | 0.534722 |
| 4 | Jarda | 95 | 76 | 92 | 2.59474 | 0.8 |
| 5 | Jan S | 47 | 36 | 45 | 0.255319 | 0.765957 |
| 6 | Jitka | 96 | 60 | 89 | 1.20312 | 0.625 |
| 7 | Jakub | 84 | 40 | 65 | 0.642857 | 0.47619 |
| Row | tutor | studenti | zápočty | psali 1. test | průměrná aktivita | propustnost |
|---|
| String7? | Int64 | Int64? | Int64 | Float64 | Float64? |
|---|
| 1 | Jan V | 146 | missing | 118 | 0.424658 | missing |
| 2 | Pavel | 149 | 84 | 138 | 1.07383 | 0.563758 |
| 3 | Jitka | 73 | 41 | 64 | 1.76027 | 0.561644 |
| 4 | Petr | 99 | missing | 78 | 0.459596 | missing |
| 5 | Jiřina | 68 | 36 | 61 | 1.49265 | 0.529412 |
| 6 | Irena | 74 | 41 | 69 | 1.5 | 0.554054 |
| 7 | Jarda | 100 | 57 | 95 | 2.46 | 0.57 |
| 8 | Jan S | 49 | 15 | 38 | 0.0612245 | 0.306122 |
| 9 | missing | 2 | 0 | 0 | 0.0 | 0.0 |
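Per-tutor summaries of this kind can be computed with groupby and combine. The sketch below uses a hypothetical five-row sample and English column names. The columns tutor, assessment and test1 follow the tables above; the rest is an assumption:

```julia
using DataFrames

df = DataFrame(
    tutor = ["Irena", "Irena", "Jitka", "Jitka", "Jitka"],
    assessment = [true, false, true, true, false],
    test1 = [3.0, missing, 2.0, 1.5, missing],
)

by_tutor = combine(
    groupby(df, :tutor),
    nrow => :students,                                   # group size
    :assessment => sum => :assessments,                  # number of true values
    :test1 => (t -> count(!ismissing, t)) => :wrote_test1,
)

# Share of students in each group who obtained the assessment.
by_tutor.pass_rate = by_tutor.assessments ./ by_tutor.students
```

If the assessment column itself contains missing values, sum returns missing for that group, which would explain missing entries in such a summary; summing over skipmissing avoids that.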
2.3 Exam period
In BI-MA1 this differs considerably from BI-ZMA.
| 1 | A | 42 |
| 2 | B | 57 |
| 3 | C | 53 |
| 4 | D | 32 |
| 5 | E | 5 |
| 6 | F | 20 |
| 7 | missing | 319 |
| 1 | A | 79 |
| 2 | B | 104 |
| 3 | C | 110 |
| 4 | D | 41 |
| 5 | E | 6 |
| 6 | F | 37 |
| 7 | missing | 383 |
| 1 | A | 108 |
| 2 | B | 121 |
| 3 | C | 107 |
| 4 | D | 35 |
| 5 | E | 9 |
| 6 | F | 26 |
| 7 | missing | 253 |
| 1 | A | 12 |
| 2 | B | 7 |
| 3 | C | 8 |
| 4 | D | 2 |
| 5 | F | 4 |
| 6 | missing | 5 |
| 1 | A | 13 |
| 2 | B | 12 |
| 3 | C | 4 |
| 4 | D | 1 |
| 5 | F | 2 |
| 6 | missing | 6 |
| 1 | A | 29 |
| 2 | B | 25 |
| 3 | C | 13 |
| 4 | D | 10 |
| 5 | F | 4 |
| 6 | missing | 15 |
| 1 | A | 47 |
| 2 | B | 39 |
| 3 | C | 32 |
| 4 | D | 6 |
| 5 | F | 7 |
| 6 | missing | 28 |
| 1 | A | 58 |
| 2 | B | 43 |
| 3 | C | 16 |
| 4 | D | 7 |
| 5 | F | 7 |
| 6 | missing | 16 |
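Frequency tables of marks like those above can be obtained by grouping on the mark column. A minimal sketch on hypothetical data, with the count column name chosen here for illustration:

```julia
using DataFrames

df = DataFrame(mark = ["A", "B", missing, "A", "F", missing])

# groupby keeps missing as a group of its own by default;
# sorting places it after all other values.
counts = sort(combine(groupby(df, :mark), nrow => :count), :mark)
```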
| 1 | 0 | 4 |
| 2 | 10 | 1 |
| 3 | 11 | 1 |
| 4 | 12 | 3 |
| 5 | 13 | 2 |
| 6 | 14 | 5 |
| 7 | 15 | 29 |
| 8 | 16 | 37 |
| 9 | 17 | 40 |
| 10 | 18 | 37 |
| 11 | 19 | 34 |
| 12 | 20 | 16 |
| 13 | missing | 319 |
| 1 | 0 | 4 |
| 2 | 10 | 1 |
| 3 | 11 | 1 |
| 4 | 12 | 3 |
| 5 | 13 | 2 |
| 6 | 14 | 5 |
| 7 | 15 | 29 |
| 8 | 16 | 37 |
| 9 | 17 | 40 |
| 10 | 18 | 37 |
| 11 | 19 | 34 |
| 12 | 20 | 16 |