Hello, I'm Takemoto, a server-side engineer. This article is day 3 of the Enigmo Advent Calendar 2020.

What's the best thing you bought in 2020? For me it was an iPad, the (then) latest iPad Pro:

Apple iPad Pro (12.9-inch, Wi-Fi, 128GB) - Silver (4th generation)
Release date: 2020/03/25
Media: Personal Computers

I mainly use it to read Kindle books in two-page spread view. One of Enigmo's employee benefits, the "Engineer Support" program, covered 50,000 yen of the cost. Nice!
https://enigmo.co.jp/recruit/culture/

Now then, have you ever bought a baken (馬券)? A baken is the betting ticket you purchase to wager on a horse race. They start at 100 yen per unit, so it's easy to give them a try. (Horse racing is for ages 20 and up.) A few horse racing fans work at Enigmo too, and our informal keiba chat sometimes gets lively on race days.

Do "guaranteed-win" horse racing books actually win?

There are plenty of keiba hisshobon (競馬必勝本, "guaranteed-win" horse racing books) out there. These books tend to showcase only the winning tickets and their payouts, and you often can't tell how much was actually wagered to earn those wins. So I decided to test one: buy tickets exactly as the book instructs and measure how much you actually win.

The book I used as a reference is 「競馬力を上げる馬券統計学の教科書」 (roughly, "The Textbook of Betting-Ticket Statistics for Better Horse Racing").

競馬力を上げる馬券統計学の教科書
Author: 大谷清文
Release date: 2019/10/31
Media: Kindle edition

Among the many guaranteed-win books out there, this one is unusually statistical: true to its premise of picking winners from odds data, it leans on numbers rather than hunches. And because it looks only at the odds, never at the horses or jockeys, it has the advantage of applying generically to any race. Its approach boils down to two points:

- Aim for man-baken (万馬券, tickets that pay out 10,000 yen or more)
- Obsess over recovery rate, not hit rate

The idea: if you keep spending 3,000 yen per race and one race in three returns a 10,000 yen payout, you finish 1,000 yen ahead. In other words, you need to bet in races where a hit would be a man-baken, buying tickets that link dark horses (anauma, 穴馬) with the likely winners.
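The break-even arithmetic (3,000 yen staked per race, a 10,000 yen payout every third race) is easy to sanity-check. This is a throwaway sketch of mine, not code from the book:

```python
# My own sketch of the book's break-even claim: stake 3,000 yen per race
# and hit a 10,000 yen payout once every three races.
stake_per_race = 3000
races = 3
payout = 10000

total_stake = stake_per_race * races   # 9,000 yen wagered over three races
profit = payout - total_stake          # net result after the one hit
recovery_rate = payout / total_stake   # payout relative to total stake

print(profit)                   # 1000
print(round(recovery_rate, 3))  # 1.111
```

A roughly 111% recovery rate despite hitting only one race in three, which is exactly why the book favors recovery rate over hit rate.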
Pick dark horses with the "quinella odds wall" rule

The simplest dark-horse selection method introduced in the book is the rule of the 馬連オッズの壁, the "quinella odds wall". (馬連, the quinella, is a ticket that names the top two finishers.) In short:

- Race conditions: 14 or more starters; the most popular quinella combination has odds of 9x or higher; 10 or more horses have win odds within 30x; the most popular quinella combination includes the win favorite.
- Dark horses: line up the quinella odds of the win favorite paired with each rival in popularity order, look for a "wall" where the odds jump to 1.8x or more of the previous entry, and pick the two horses in front of that wall.
- Ticket construction: a formation trio (三連複) of "ranks 1-4" - "ranks 5-8" - "dark horses", where the ranks come from that same popularity ordering of the win favorite's quinella odds. (三連複 is a ticket naming the top three finishers in any order; for "formation" see https://www.jra.go.jp/kouza/yougo/w528.html)

The book itself combines many more conditions when selecting tickets.
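To make the "wall" concrete, here is a minimal sketch of the 1.8x-jump detection. The horse numbers and odds below are invented for illustration; only the 1.8x threshold and the two-horse pick come from the rule described above:

```python
import pandas as pd

# Invented example: quinella odds of the win favorite paired with each
# rival, already sorted in ascending (popularity) order.
# Keys are horse numbers, values are odds.
ninkiuma = pd.Series({7: 5.1, 6: 7.9, 3: 9.8, 2: 11.2, 10: 14.3,
                      14: 18.5, 15: 23.0, 8: 30.1, 4: 40.0, 16: 85.0})

# How much each entry's odds jump relative to the previous entry.
ratio = ninkiuma / ninkiuma.shift(1)

anauma = []
for idx in [i for i, r in enumerate(ratio) if r > 1.8]:
    # A "wall": the odds jump by 1.8x or more between positions idx-1 and
    # idx. As in the article's get_baken, take the pair of horses around it.
    anauma += ninkiuma.index[idx - 1:idx + 1].tolist()

print(anauma)  # [4, 16]
```

With these toy odds the only wall is the jump from 40.0 to 85.0, so horses 4 and 16 become the dark-horse candidates.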
For ease of implementation I simplified the conditions this time; please bear with me.

The actual code

I used Google Colaboratory. If you have a Google account you can run Python without setting up an environment, which is convenient.
https://colab.research.google.com/
I confirmed everything below works as of 2020/11/27, so please do try it yourself.

First, install the required libraries:
```shell
!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!pip install selenium
!pip install lxml
```

Then the imports:

```python
import pandas as pd
from bs4 import BeautifulSoup
import urllib.request as req
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import numpy as np
import urllib.parse
```

And the functions prepared for this experiment:

```python
def set_selenium():
    # Headless Chrome driver for scraping pages rendered with JavaScript
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-dev-shm-usage')
    driver = webdriver.Chrome('chromedriver', options=options)
    driver.implicitly_wait(15)
    return driver


def get_raceids(date):
    # Collect the race_id of every race held on the given day (YYYYMMDD)
    url = "https://race.netkeiba.com/top/race_list_sub.html?kaisai_date=" + date
    res = req.urlopen(url)
    racesoup = BeautifulSoup(res, "html.parser")
    racelist = racesoup.select("#RaceTopRace > div > dl > dd > ul > li > a:nth-of-type(1)")
    raceids = [urllib.parse.parse_qs(urllib.parse.urlparse(race.get('href')).query)['race_id'][0]
               for race in racelist]
    return raceids


def get_tansho_ichiban(raceid):
    # Return the win favorite's horse number plus the win/place odds table
    driver = set_selenium()
    driver.get("https://race.netkeiba.com/odds/index.html?type=b1&race_id="
               + raceid + "&rf=shutuba_submenu")
    html = driver.page_source.encode('utf-8')
    tanhukusoup = BeautifulSoup(html, "html.parser")
    tanhukudfs = pd.read_html(str(tanhukusoup.html))
    tansho_ichiban = tanhukudfs[0][tanhukudfs[0]['オッズ'] == tanhukudfs[0]['オッズ'].min()]['馬番'].values[0]
    return tansho_ichiban, tanhukudfs[0]


def get_umarenodds(raceid):
    # Build a symmetric matrix of quinella odds for every pair of horses
    driver = set_selenium()
    driver.get("https://race.netkeiba.com/odds/index.html?type=b4&race_id="
               + raceid + "&housiki=c0&rf=shutuba_submenu")
    html = driver.page_source.encode('utf-8')
    soup = BeautifulSoup(html, "html.parser")
    dfs = pd.read_html(str(soup.html))
    umarendf = pd.DataFrame(index=[1])
    for i, df in enumerate(dfs):
        umarendf = pd.concat([umarendf, df.set_index(str(i + 1)).dropna(how='all', axis=1)], axis=1)
    # Skip races with scratched ('取消') or excluded ('除外') horses
    if umarendf.isin(['取消']).values.any() | umarendf.isin(['除外']).values.any():
        return False
    umarendf[umarendf.index.max()] = 0
    umarenodds = pd.DataFrame(umarendf.fillna(0).astype('float64').values
                              + umarendf.astype('float64').fillna(0).values.T,
                              columns=list(map(int, map(float, umarendf.columns))),
                              index=umarendf.index).replace(0, np.nan)
    return umarenodds


def get_baken(raceid):
    # Apply the "quinella odds wall" rule; return the tickets to buy, or False
    tansho_ichiban, tanhukudf = get_tansho_ichiban(raceid)
    umarenodds = get_umarenodds(raceid)
    if umarenodds is False:
        return False
    umarenninki = umarenodds.min()
    umaren_ichiban = umarenninki[umarenninki == umarenninki.min()].index
    # 14+ starters, quinella favorite at 9x or more, 10+ horses within 30x
    # win odds, and the quinella favorite includes the win favorite
    if umarenninki.min() >= 9 and any(umaren_ichiban == tansho_ichiban) \
            and umarenninki.index.max() >= 14 and sum(tanhukudf["オッズ"] <= 30) >= 10:
        ninkiuma = umarenodds[tansho_ichiban].sort_values()
        anaumalist = []
        for idx in np.where((ninkiuma / ninkiuma.shift(1) > 1.8).values == True)[0]:
            two = idx - 1
            anaumalist = anaumalist + (ninkiuma / ninkiuma.shift(1) > 1.8).index.values[two:two + 2].tolist()
        if not anaumalist:
            return False
        formation1 = ninkiuma.fillna(0).sort_values()[0:4].index.values
        formation2 = ninkiuma.fillna(0).sort_values()[4:8].index.values
        return {'anauma': anaumalist, 'formation1': formation1, 'formation2': formation2}
    return False


def get_dayresult(date):
    # Simulate one race day: total stake vs. total payout for our tickets
    kakekin = 0
    modorikin = 0
    raceids = get_raceids(date)
    for raceid in raceids:
        baken = get_baken(raceid)
        if not baken:
            continue
        result = pd.read_html("https://race.netkeiba.com/race/result.html?race_id=" + raceid)
        sanrenpuku = list(map(int, result[2].set_index(0)[1]['3連複'].split()))
        money = int(result[2].set_index(0)[2]['3連複'].replace(',', '').replace('円', ''))
        # Every formation ticket costs 100 yen; a hit needs one horse from each group
        kakekin += 100 * len(baken['formation1']) * len(baken['formation2']) * len(baken['anauma'])
        if bool(set(sanrenpuku) & set(baken['formation1'])) \
                & bool(set(sanrenpuku) & set(baken['formation2'])) \
                & bool(set(sanrenpuku) & set(baken['anauma'])):
            modorikin += money
    cols = ["賭け金", "払戻金"]
    return pd.Series([kakekin, modorikin], index=cols, name=date)
```

The code scrapes the odds data from netkeiba with Selenium + BeautifulSoup, preprocesses it with pandas, and selects tickets by the "quinella odds wall" rule above, then checks whether the selected tickets won. Since the book's whole philosophy is recovery rate, the verification tracks:
- every formation trio ticket matching the conditions, bought at 100 yen each = the stake (賭け金)
- the actual payout (払戻金) of the races that hit

and looks at the difference between the two. This time I verified the races held from November 1 through 23.

```python
# list of race days
datelist = ['20201101', '20201107', '20201108', '20201114',
            '20201115', '20201121', '20201122', '20201123']
moukaridf = pd.DataFrame()
for date in datelist:
    onedaydf = get_dayresult(date)
    moukaridf = moukaridf.append(onedaydf)
```

This takes a while, so be patient.
Results

```python
moukaridf.sum()['払戻金'] - moukaridf.sum()['賭け金']
# output
# -14630.0
```

In total I lost 14,630 yen.

```python
moukaridf
```

              払戻金     賭け金
    20201101  33500.0  12800.0
    20201107      0.0   9600.0
    20201108   5560.0  16000.0
    20201114      0.0  25600.0
    20201115  45510.0  25600.0
    20201121      0.0      0.0
    20201122      0.0   9600.0
    20201123      0.0      0.0

Looking day by day, there were winning days too. The book describes its ticket-selection method for raising the recovery rate in much finer detail, so a faithful implementation might produce better results.
Running a t-test on the daily differences (payout minus stake), the null hypothesis that their mean is zero could not be rejected, so there is no evidence that the strategy leans either positive or negative (p > 0.05; mean -1,828.8, standard deviation 14,776 yen).

```python
sagaku = moukaridf[['払戻金']].values - moukaridf[['賭け金']].values
# mean
print(sagaku.mean())
# standard deviation
print(sagaku.std())
# output
# -1828.75
# 14776.743076114573
```

So the conclusion is: "sometimes you win, sometimes you lose!?"

Buying a ticket

Since I went to all the trouble of building this, I decided to actually place a bet! The verification above selected tickets from the final odds, but on race day I pick them from the 10:30 a.m. odds instead.
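A side note on that t-test: it can be reproduced with scipy. The snippet below is my own sketch, not code from the article, using the eight daily payout-minus-stake differences from the results table:

```python
import numpy as np
from scipy import stats

# Daily payout-minus-stake differences, computed from the results table
sagaku = np.array([20700, -9600, -10440, -25600, 19910, 0, -9600, 0])

# One-sample t-test of the mean against zero
t, p = stats.ttest_1samp(sagaku, 0)

print(sagaku.mean())  # -1828.75
print(p > 0.05)       # True: the zero-mean hypothesis is not rejected
```

With only eight observations and a standard deviation this large, the test has little power either way, which matches the "win some, lose some" conclusion.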
Budget being limited, rather than buying every qualifying ticket I picked a single race and bought its formation trio tickets. As of 10:30 on 11/29 there were four candidate races.

```python
date = "20201129"
raceids = get_raceids(date)
for raceid in raceids:
    baken = get_baken(raceid)
    if baken:
        print('https://race.netkeiba.com/race/shutuba.html?race_id=' + raceid)
        print(baken)
        print("=======================")
# output: prints the URL of each qualifying race and the tickets to buy
```

    https://race.netkeiba.com/race/shutuba.html?race_id=202005050907
    {'anauma': [4, 16], 'formation1': array([7, 6, 3, 2]), 'formation2': array([10, 14, 15, 8])}
    ============
    https://race.netkeiba.com/race/shutuba.html?race_id=202005050911
    {'anauma': [11, 6, 1, 16], 'formation1': array([4, 14, 9, 13]), 'formation2': array([12, 5, 2, 3])}
    ============
    https://race.netkeiba.com/race/shutuba.html?race_id=202009050904
    {'anauma': [3, 7], 'formation1': array([8, 15, 5, 1]), 'formation2': array([16, 4, 13, 12])}
    ============
    https://race.netkeiba.com/race/shutuba.html?race_id=202009050912
    {'anauma': [7, 6], 'formation1': array([13, 3, 10, 11]), 'formation2': array([2, 9, 15, 5])}
    ============

This time I bet on Tokyo 7R (the first race in the output). Dark horses: 4 and 16. Ranks 1-4: 7, 6, 3, 2. Ranks 5-8: 10, 14, 15, 8.

Accessed https://www.ipat.jra.go.jp/ on 2020/11/29. Come on, win!

And the result was...

1st: #2, 2nd: #10, 3rd: #8
3歳以上2勝クラス 結果・払戻 | 2020年11月29日 東京7R レース情報(JRA) - netkeiba.com

Too bad! The dark-horse candidates, #4 and #16, did not finish in the top three.
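As a footnote, counting what a formation like the Tokyo 7R one costs is straightforward. A quick sketch of mine using the numbers from the output above (the 100 yen unit price is the same one used in the verification):

```python
from itertools import product

anauma = [4, 16]              # dark-horse candidates
formation1 = [7, 6, 3, 2]     # quinella-odds ranks 1-4
formation2 = [10, 14, 15, 8]  # quinella-odds ranks 5-8

# The three groups share no horses, so each pick of one horse per group
# is a distinct trio ticket.
tickets = list(product(formation1, formation2, anauma))

print(len(tickets))        # 32 tickets
print(len(tickets) * 100)  # 3200 yen at 100 yen apiece
```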
It was close, though!

That's all from me. I wish you all a happy horse racing life. When scraping or crawling, please be considerate of the sites you access.

References

競馬力を上げる馬券統計学の教科書
増補改訂Pythonによるスクレイピング&機械学習 開発テクニック
ColaboratoryでSeleniumが使えた:JavaScriptで生成されるページも簡単スクレイピング - Qiita

Tomorrow's article is by Terada-san, another server-side engineer. Look forward to it!

株式会社エニグモ 正社員の求人一覧
hrmos.co