{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Feladatok\n", "Oldjuk meg az alábbi feladatokat a pandas segítségével!\n", "\n", "A [pl.txt](pl.txt) szövegfájl a Premier League 2011-12-es szezonjának eredményeit tartalmazza. Készítsünk programot, amely:\n", "- Betölti a fájl adatait egy DataFrame-be!\n", "- Meghatározza, hogy:\n", " - hány mérkőzés volt ez egyes fordulókban?\n", " - hány gól esett az egyes fordulókban?\n", " - mennyi volt az átlagos mérkőzésenkénti gólszám az egyes fordulókban?\n", " - melyik fordulóban esett a legtöbb gól?\n", " - a mérkőzések hány százalékán esett gól?\n", " - melyik mérkőzésen esett a legtöbb gól?\n", " - hány gólt rúgott összesen a Manchester United?\n", "\n", "- Kiírja: \n", " - a tíz leggólgazdagabb fordulót a gólszámokkal együtt!\n", " - a 10., 20. és 30. fordulóban hány gól esett összesen!" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# A megoldáshoz a pandas-t használjuk.\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
roundhteamateamhgoalsagoals
01Blackburn RoversWolverhampton Wanderers12
11Fulham FCAston Villa00
21Liverpool FCSunderland AFC11
31Queens Park RangersBolton Wanderers04
41Wigan AthleticNorwich City11
..................
37538Sunderland AFCManchester United01
37638Swansea CityLiverpool FC10
37738Tottenham HotspurFulham FC20
37838West Bromwich AlbionArsenal FC23
37938Wigan AthleticWolverhampton Wanderers32
\n", "

380 rows × 5 columns

\n", "
" ], "text/plain": [ " round hteam ateam hgoals agoals\n", "0 1 Blackburn Rovers Wolverhampton Wanderers 1 2\n", "1 1 Fulham FC Aston Villa 0 0\n", "2 1 Liverpool FC Sunderland AFC 1 1\n", "3 1 Queens Park Rangers Bolton Wanderers 0 4\n", "4 1 Wigan Athletic Norwich City 1 1\n", ".. ... ... ... ... ...\n", "375 38 Sunderland AFC Manchester United 0 1\n", "376 38 Swansea City Liverpool FC 1 0\n", "377 38 Tottenham Hotspur Fulham FC 2 0\n", "378 38 West Bromwich Albion Arsenal FC 2 3\n", "379 38 Wigan Athletic Wolverhampton Wanderers 3 2\n", "\n", "[380 rows x 5 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Töltsük be a pl.txt fájl adatait DataFrame-be!\n", "names = ['round', 'hteam', 'ateam', 'hgoals', 'agoals']\n", "df = pd.read_csv('pl.txt', sep='\\t', skiprows=6, names=names)\n", "df" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 380 entries, 0 to 379\n", "Data columns (total 5 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 round 380 non-null int64 \n", " 1 hteam 380 non-null object\n", " 2 ateam 380 non-null object\n", " 3 hgoals 380 non-null int64 \n", " 4 agoals 380 non-null int64 \n", "dtypes: int64(3), object(2)\n", "memory usage: 15.0+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
roundhgoalsagoals
count380.000000380.000000380.000000
mean19.5000001.5894741.215789
std10.9803131.3315311.204987
min1.0000000.0000000.000000
25%10.0000001.0000000.000000
50%19.5000001.0000001.000000
75%29.0000002.0000002.000000
max38.0000008.0000006.000000
\n", "
" ], "text/plain": [ " round hgoals agoals\n", "count 380.000000 380.000000 380.000000\n", "mean 19.500000 1.589474 1.215789\n", "std 10.980313 1.331531 1.204987\n", "min 1.000000 0.000000 0.000000\n", "25% 10.000000 1.000000 0.000000\n", "50% 19.500000 1.000000 1.000000\n", "75% 29.000000 2.000000 2.000000\n", "max 38.000000 8.000000 6.000000" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
roundhteamateamhgoalsagoalsgoals
01Blackburn RoversWolverhampton Wanderers123
11Fulham FCAston Villa000
21Liverpool FCSunderland AFC112
31Queens Park RangersBolton Wanderers044
41Wigan AthleticNorwich City112
.....................
37538Sunderland AFCManchester United011
37638Swansea CityLiverpool FC101
37738Tottenham HotspurFulham FC202
37838West Bromwich AlbionArsenal FC235
37938Wigan AthleticWolverhampton Wanderers325
\n", "

380 rows × 6 columns

\n", "
" ], "text/plain": [ " round hteam ateam hgoals agoals \\\n", "0 1 Blackburn Rovers Wolverhampton Wanderers 1 2 \n", "1 1 Fulham FC Aston Villa 0 0 \n", "2 1 Liverpool FC Sunderland AFC 1 1 \n", "3 1 Queens Park Rangers Bolton Wanderers 0 4 \n", "4 1 Wigan Athletic Norwich City 1 1 \n", ".. ... ... ... ... ... \n", "375 38 Sunderland AFC Manchester United 0 1 \n", "376 38 Swansea City Liverpool FC 1 0 \n", "377 38 Tottenham Hotspur Fulham FC 2 0 \n", "378 38 West Bromwich Albion Arsenal FC 2 3 \n", "379 38 Wigan Athletic Wolverhampton Wanderers 3 2 \n", "\n", " goals \n", "0 3 \n", "1 0 \n", "2 2 \n", "3 4 \n", "4 2 \n", ".. ... \n", "375 1 \n", "376 1 \n", "377 2 \n", "378 5 \n", "379 5 \n", "\n", "[380 rows x 6 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# \"Gólok száma\" oszlop felvétele.\n", "df['goals'] = df['hgoals'] + df['agoals']\n", "df" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "round\n", "1 10\n", "2 10\n", "3 10\n", "4 10\n", "5 10\n", "6 10\n", "7 10\n", "8 10\n", "9 10\n", "10 10\n", "11 10\n", "12 10\n", "13 10\n", "14 10\n", "15 10\n", "16 10\n", "17 10\n", "18 10\n", "19 10\n", "20 10\n", "21 10\n", "22 10\n", "23 10\n", "24 10\n", "25 10\n", "26 10\n", "27 10\n", "28 10\n", "29 10\n", "30 10\n", "31 10\n", "32 10\n", "33 10\n", "34 10\n", "35 10\n", "36 10\n", "37 10\n", "38 10\n", "dtype: int64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Példák csoportosításra (groupby).\n", "# Hány mérkőzés volt az egyes fordulókban?\n", "df.groupby('round').size()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1 10\n", "29 10\n", "22 10\n", "23 10\n", "24 10\n", "25 10\n", "26 10\n", "27 10\n", "28 10\n", "30 10\n", "2 10\n", "31 10\n", "32 10\n", "33 10\n", "34 10\n", "35 10\n", "36 10\n", "37 10\n", "21 10\n", "20 10\n", "19 10\n", "18 10\n", "3 10\n", "4 10\n", "5 10\n", "6 10\n", "7 10\n", "8 10\n", "9 10\n", "10 10\n", "11 10\n", "12 10\n", "13 10\n", "14 10\n", "15 10\n", "16 10\n", "17 10\n", "38 10\n", "Name: round, dtype: int64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# (majdnem) ugyanez, tömörebben\n", "df['round'].value_counts()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "round\n", "1 20\n", "2 23\n", "3 31\n", "4 22\n", "5 38\n", "6 27\n", "7 34\n", "8 32\n", "9 31\n", "10 39\n", "11 31\n", "12 30\n", "13 25\n", "14 30\n", "15 30\n", "16 18\n", "17 31\n", "18 19\n", "19 26\n", "20 29\n", "21 22\n", "22 35\n", "23 23\n", "24 31\n", "25 36\n", "26 30\n", "27 24\n", "28 17\n", "29 27\n", "30 25\n", "31 35\n", "32 23\n", "33 29\n", "34 28\n", "35 22\n", "36 35\n", "37 26\n", "38 32\n", "Name: goals, dtype: int64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Hány gól esett az egyes fordulókban?\n", "df.groupby('round')['goals'].sum()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Melyik fordulóban esett a legtöbb gól?\n", "df.groupby('round')['goals'].sum().idxmax()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "round\n", "1 2.0\n", "2 2.3\n", "3 3.1\n", "4 2.2\n", "5 3.8\n", "6 2.7\n", "7 3.4\n", "8 3.2\n", "9 3.1\n", "10 3.9\n", "11 3.1\n", "12 3.0\n", "13 2.5\n", "14 3.0\n", "15 3.0\n", "16 1.8\n", "17 3.1\n", "18 1.9\n", "19 2.6\n", "20 2.9\n", "21 2.2\n", "22 3.5\n", "23 2.3\n", "24 3.1\n", "25 3.6\n", "26 3.0\n", "27 2.4\n", "28 1.7\n", "29 2.7\n", "30 2.5\n", "31 3.5\n", "32 2.3\n", "33 2.9\n", "34 2.8\n", "35 2.2\n", "36 3.5\n", "37 2.6\n", "38 3.2\n", "Name: goals, dtype: float64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Az átlagos mérkőzésenkénti gólszám az egyes fordulókban.\n", "df.groupby('round')['goals'].mean()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "round\n", "10 39\n", "5 38\n", "25 36\n", "22 35\n", "36 35\n", "31 35\n", "7 34\n", "38 32\n", "8 32\n", "9 31\n", "Name: goals, dtype: int64" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Írjuk ki a tíz leggólgazdagabb fordulót a gólszámokkal együtt!\n", "df.groupby('round')['goals'].sum().sort_values(ascending=False).head(10)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "92.89473684210526" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# A mérkőzések hány százalékán esett gól?\n", "(df['goals'] > 0).mean() * 100" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "round 3\n", "hteam Manchester United\n", "ateam Arsenal FC\n", "hgoals 8\n", "agoals 2\n", "goals 10\n", "Name: 29, dtype: object" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Melyik mérkőzésen esett a legtöbb gól?\n", "df.loc[df['goals'].idxmax()]" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "93" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Írjuk ki, hogy a 10., 20. és 30. fordulóban hány gól esett összesen!\n", "df[(df['round'] == 10) | (df['round'] == 20) | (df['round'] == 30)]['goals'].sum()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "93" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ...alternatív megoldás:\n", "df[df['round'].isin({10, 20, 30})]['goals'].sum()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "93" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ...harmadik megoldás:\n", "df.groupby('round')['goals'].sum()[[10, 20, 30]].sum()" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "93" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ...negyedik megoldás:\n", "df[(df['round'] % 10 == 0)]['goals'].sum()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "89" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Hány gólt rúgott összesen a Manchester United?\n", "df[df['hteam'] == 'Manchester United']['hgoals'].sum() + df[df['ateam'] == 'Manchester United']['agoals'].sum()" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "89" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ...alternatív megoldás:\n", "sum([df[df[f'{p}team'] == 'Manchester United'][f'{p}goals'].sum() for p in 'ha'])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 2 }