Databricks, a Strong Ally for Data Utilization: Building a Data Platform, Practical Tips from the Field, and Case Studies
Databricks has been attracting attention recently as a cloud-based analytics platform for big data. Takaaki Yayoi, Senior Solutions Architect at Databricks Japan, explained the key points for using Databricks, along with use cases and best practices. This report covers his talk together with a Q&A session with employees of Sky Co., Ltd., which proposes Databricks to customers as a solution for supporting their DX.
What Is the Databricks Lakehouse Platform?

Databricks Japan K.K.
Takaaki Yayoi, Senior Solutions Architect, Databricks
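After completing graduate school at the University of Tsukuba, Yayoi worked on system development and on research and development in natural language processing at Japanese enterprise and manufacturing companies. While posted overseas he was involved in developing big data solutions, and after working on data analytics projects at a foreign consulting firm, he joined Databricks Japan.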
He currently works as a Databricks Senior Solutions Architect, supporting customers mainly in pharmaceuticals, retail, and manufacturing with data utilization and cost reduction through the adoption of Databricks.
Databricks is a startup founded in 2013 by several developers of OSS projects such as Apache Spark. It describes itself as a lakehouse company, a term that combines the data lake and the data warehouse.
More than 10,000 companies have adopted Databricks, and the company has more than 5,000 employees worldwide. With about 1,500 currently in the Japan region, it is a leading data and AI company that is growing rapidly in both company size and services.

åŒ¥çæ°ã¯ãããžã¿ã«ãã©ã³ã¹ãã©ãŒã¡ãŒã·ã§ã³ïŒä»¥äžãDXïŒã®åãçµã¿ãæãããã«é²ãŸãªãäŒæ¥ãå€ãèŠãŠãããšèªããããã«ã¯ãçµç¹ããã·ã¹ãã ãã人ããšå€§ãã3ã€ã®é åã§ãããã課é¡ããããDatabricksãæäŸããããŒã¿ã¬ã€ã¯ããŠã¹ã®ä»çµã¿ããµãŒãã¹ã掻çšããã°è§£æ±ºã«å°ãããšãã§ãããšå匷ã話ããDatabricksã®æ©èœã«ã€ããŠè§£èª¬ãè¡ã£ãã
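"Data lakehouse" is a word that combines "data lake" and "data warehouse," and the system likewise takes in the functional advantages of both. A data lake has the advantage of being able to store anything regardless of data type or size. Seen from another angle, however, it accumulates a large volume of miscellaneous data, which makes it hard to find the data you need in an instant.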
The lakehouse removes this drawback by combining the data lake with a data warehouse (DWH). As a result, data on the order of petabytes can be searched almost instantly. In fact, Yayoi said, Apple, one of Databricks' customers, uses the data lakehouse for work that requires detecting unauthorized access quickly.
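Although it is a data lake, it also supports DWH capabilities such as building BI dashboards, as well as AI and machine learning work. Naturally, it handles both the unstructured data that data lakes accumulate and the structured data, such as table data, that a DWH normally handles, in the same way.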
Beyond batch data, it can also handle real-time streaming data arriving from edge devices.

Yayoi next gave a brief overview of the structure and characteristics of Databricks. It is a cloud-native service and runs on the three major cloud providers. Because its founders are OSS developers, the company and its service favor an open source culture, while the architecture also firmly covers security, management, and governance.
In a data utilization platform built on typical data platform services, the data engineers responsible for data processing such as ELT (extract, load, transform) and for cleansing usually differ depending on the data format and domain.
With Databricks, by contrast, there is only one copy of the data to work with, and all data-related engineers and team members work on that same shared data. Yayoi described "simple," one of the platform's characteristics, as follows.
"Single Source of Truth. In other words, the one and only source of information you can trust." (Yayoi)
On the theme of simplicity, Yayoi added that the UI is a GUI that is "close to Jupyter Notebook, so anyone who has used that service will be able to pick it up all the more quickly."

The second characteristic is "open." Because it is based on OSS, the source code is, as a rule, all published on GitHub. There is also the kind of community that only OSS has, where volunteers quickly resolve problems and issues as they arise.
The APIs are also public, so when building an ecosystem that connects and integrates with various related tools, existing data assets and tools can be used and reused.

The final characteristic is "collaborative." Software development has a full set of features and tools that smooth communication between teams. Databricks can be thought of as the data utilization version of that.
For example, suppose a data scientist on an AI project builds some logic in a notebook. In Databricks, elements such as models, dashboards, and datasets are all designed to be shared among multiple users, so any member of the development team can share them and development proceeds smoothly.
Valuable Features for People in Many Different Data Roles
Data pipeline development, ETL processing, BI dashboard creation, and more: across many initiatives in a wide range of roles, everything can be run on Databricks using a single copy of the data. Here, "workload" means any kind of job or initiative.

Data governance is handled by a feature called Unity Catalog. Through Unity Catalog, Databricks can show users where all of the data and files they access are located.

For sharing data with other companies, Delta Sharing, a data sharing protocol that Databricks developed itself, proves its value. As long as a platform uses Delta Sharing, data can be shared even across different platforms.
"For example, you can access the raw data directly from BI tools such as Power BI and Tableau, and even from Python's Pandas. Another characteristic is that security is firmly ensured when that access happens." (Yayoi)
åŒ¥çæ°ã¯ããããã®è·çš®ã®åçš®åçµã¿ã«ãããŠãDatabricksãã©ã®ããã«æŽ»çšãããŠããã®ããå®éã®GUIãããã·ã¥ããŒããšåãããŠç޹ä»ããã
åŒ¥çæ°ã¯ãæ©æ¢°åŠç¿ã§ã®å©çšã«ã€ããŠã¯åµæ¥åœåããæèããŠããããã§ãDatabricksã®èãã»å§¿å¢ã次ã®ããã«èªã£ãŠããã
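"It is not enough to build a model and stop there. We have to support customers all the way to implementing it in a business process and delivering value and impact to the business. That is our thinking, and it is also our philosophy." (Yayoi)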

Based on this philosophy, Databricks also provides features for realizing the so-called MLOps flow. One example is managing model versions and statuses. "It automatically records what parameters were specified and what accuracy those parameters produced. We put a lot of effort into reproducibility," Yayoi emphasized.
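To make the idea of automatically recording parameters and accuracy concrete, here is a minimal sketch in plain Python. It is not the Databricks API: `RunLog`, `Run`, and the placeholder accuracy formula are hypothetical names and values invented for illustration, and a real project would use an actual tracking service instead.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Run:
    """One training run: the parameters used and the metrics observed."""
    run_id: int
    params: dict
    metrics: dict = field(default_factory=dict)


class RunLog:
    """Hypothetical in-memory tracker, a stand-in for a real tracking service."""

    def __init__(self):
        self.runs = []

    def start_run(self, **params):
        # Record the parameters at the moment the run starts.
        run = Run(run_id=len(self.runs) + 1, params=dict(params))
        self.runs.append(run)
        return run

    def best(self, metric):
        """Return the run with the highest recorded value of `metric`."""
        scored = [r for r in self.runs if metric in r.metrics]
        return max(scored, key=lambda r: r.metrics[metric])


log = RunLog()
for depth in (2, 4, 8):
    run = log.start_run(max_depth=depth, seed=42)
    # Placeholder score (an assumption) so the sketch runs without a real model.
    run.metrics["accuracy"] = round(0.70 + 0.02 * depth - 0.002 * depth ** 2, 3)

best_run = log.best("accuracy")
print(json.dumps({"run_id": best_run.run_id,
                  "params": best_run.params,
                  "metrics": best_run.metrics}))
```

Because every run keeps its parameters next to its metrics, the best configuration can be looked up and rerun later, which is the reproducibility point made in the talk.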
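Then there is the "citizen data scientist," a concept Gartner proposed several years ago. Databricks offers a range of AutoML features that allow even members closer to the business side to build machine learning models. Yayoi stressed that the platform provides useful features for people in all kinds of data-related positions.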
Even for a model created with AutoML, Databricks generates the underlying logic and code as a Python notebook behind the scenes, so that source can be used to customize the model.
For data engineering, there is a feature called Delta Live Tables (DLT), which makes building data pipelines, a task that tends to become complex, simple and fast. Put briefly, DLT resolves the dependencies between data pipelines, which are the reason for that complexity.
As shown on the slide in the session, a graph structure is generated automatically, so the real-time execution status can be checked visually. DLT thus also provides management features for operations.
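The dependency resolution described above can be illustrated with a topological sort. The following is a sketch of the general technique using Python's standard `graphlib` module, not DLT itself; the table names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical tables: each key depends on the tables listed in its set.
dependencies = {
    "raw_orders": set(),
    "raw_customers": set(),
    "clean_orders": {"raw_orders"},
    "clean_customers": {"raw_customers"},
    "sales_by_customer": {"clean_orders", "clean_customers"},
}


def execution_order(deps):
    """Return one valid order in which every table runs after its inputs."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

The same dependency mapping is also a graph that can be drawn, which is the kind of structure the automatically generated pipeline view presents.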
BI dashboards are of course included as well. They are based on Redash, a company that Databricks acquired several years ago, and are now offered as Databricks SQL.

åŒ¥çæ°ã¯Databricksã®ç¹åŸŽãæ©èœã玹ä»ããåŸãæ¹ããŠDXæšé²ã«ãããäŒæ¥ã®èª²é¡ã«å¯ŸããŠãDatabricksãå©çšããããšã§ã©ã®ããã«è§£æ±ºã§ããã®ãããããããã®é åã§è§£èª¬ããã
First is organization. With the collaboration features, initiatives that previously required obtaining permission can proceed smoothly and quickly. Databricks pricing is based on the use of compute resources, so even after adoption the cost is 0 yen if nothing is used. That is a welcome property for starting small.
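On the systems side, Databricks supports GPUs, which are essential for building large language models. With Databricks, a GPU execution environment starts up quickly just by choosing an instance type on the dashboard. This is close to the convenience offered by cloud services in general.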
Last is people. Because Databricks offers such a variety of features, Yayoi summarized the effects as follows.
"When the productivity of each role that works with data improves, employee engagement improves as a result. We sometimes hear from clients that their attrition rate has gone down since adopting Databricks." (Yayoi)
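He continued that, according to clients in Japan, many engineers want to work in an environment where they can use the latest technologies such as large language models, and in some cases whether such an environment is in place, that is, whether Databricks is available, has become a criterion engineers use when choosing where to work.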

åŒ¥çæ°ã¯å®éã®å°å ¥äºäŸã玹ä»ãããç°èŸºäžè±è£œè¬ã§ã¯ãåŸæ¥ã®ãªã³ãã¬ç°å¢ã§ã¯æ±ããé£ããã£ã1ãã©ãã€ããè¶ ããããã°ããŒã¿ãåŠçã§ããããã«ãªã£ããå ããŠããããŸã§åŒ¥çæ°ãè¿°ã¹ãŠãããããªåçš®æ©èœã«ãããã³ã©ãã¬ãŒã·ã§ã³ãªãã³ã«ã¡ã³ããŒéã®ã³ãã¥ãã±ãŒã·ã§ã³ã掻çºã«ãªã£ãã
åŒ¥çæ°ã¯ã次ã®ããã«Databrickså°å ¥ã®ææãè¿°ã¹ãŠããã
"With higher productivity and smoother communication, I feel a positive cycle is emerging in which teams take on new themes one after another." (Yayoi)

A Simple, Unified Platform That Democratizes Data
Next, Yayoi discussed the differences from other cloud services. With the services offered by cloud providers, jobs such as data processing and model building require combining several services, so things inevitably tend to become complex.

The same is true of services specialized for a particular function, such as a DWH. Databricks, by contrast, has only one copy of the raw data, and its interfaces are just notebooks and dashboards, so it is very simple.

As a result, even people on the business side can work with data easily, which is what is known as the democratization of data. Actual adoption cases were also introduced.

Using Unity Catalog makes both environment setup and operations easier. Splitting workspaces improves manageability, but user management becomes burdensome and data tends to become siloed. Unity Catalog resolves these issues as well.
Data can be organized and managed in three tiers, so it can be separated by development phase, business requirement, and so on.
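As a sketch of what three-tier organization can look like, the snippet below assumes the three tiers are named catalog, schema, and table and are written as `catalog.schema.table`. The parsing helper `parse_table_name` and the sample names (`dev`, `prod`, `sales`, and so on) are hypothetical and only illustrate separating data by development phase.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    catalog: str  # top tier, e.g. a development phase such as dev or prod
    schema: str   # middle tier, e.g. a business domain
    table: str    # bottom tier, the table itself


def parse_table_name(full_name: str) -> TableName:
    """Split a three-part name and reject anything that is not three parts."""
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
    return TableName(*parts)


names = ["dev.sales.orders", "prod.sales.orders", "prod.marketing.campaigns"]

# Group tables by their top tier to show the separation by phase.
by_catalog = {}
for name in names:
    parsed = parse_table_name(name)
    by_catalog.setdefault(parsed.catalog, []).append(f"{parsed.schema}.{parsed.table}")

print(by_catalog)
```

The same table name can exist under both `dev` and `prod` without conflict, which is the practical benefit of adding a tier above the schema.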
Databricks also has a concept called the "medallion architecture," which classifies raw data as bronze, cleansed data as silver, and data that is ready for use in BI and similar tools as gold.
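The bronze, silver, and gold stages can be sketched in a few lines of plain Python. This is only an illustration of the idea under simple assumptions: the sample rows, the cleansing rules, and the function names `to_silver` and `to_gold` are all hypothetical.

```python
# Bronze: raw records exactly as they arrived (hypothetical sample data).
bronze = [
    {"order_id": "1", "amount": "120.5", "region": "east "},
    {"order_id": "2", "amount": "n/a", "region": "West"},
    {"order_id": "3", "amount": "80", "region": "EAST"},
]


def to_silver(rows):
    """Cleanse: cast types, normalize text, and drop rows that fail to parse."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # unparseable amount, so the row is excluded from silver
        cleaned.append({
            "order_id": int(row["order_id"]),
            "amount": amount,
            "region": row["region"].strip().lower(),
        })
    return cleaned


def to_gold(rows):
    """Aggregate cleansed rows into a BI-ready total per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping the bronze rows untouched means the silver and gold stages can always be rebuilt if the cleansing rules change.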

Yayoi also touched on data modeling. To repeat, Databricks also functions as a DWH, so it provides the features needed for data modeling approaches such as star schema and Data Vault.
As mentioned earlier, Databricks is an open ecosystem and integrates with many related tools. Among them, Yayoi highlighted its use with Hightouch, a reverse ETL tool. He added that Databricks also offers Partner Connect, which smooths integration with such related products, and Marketplace, which provides access to and use of open data, and with that he concluded his session.

Yayoi Answers the Concerns and Databricks Usage Challenges That Sky Co., Ltd. Employees Face in the Field
Next, members of the Business Solution Group in the Client System Development Division of Sky Co., Ltd. joined for a Q&A session. Sky Co., Ltd. has recently had many projects that analyze and make use of data, particularly in AI, and reportedly uses Databricks often. The members asked about concerns they actually face in the field and for hints on getting more out of Databricks, and Yayoi answered.
Q: Please tell us about the differences in features across cloud services, the adoption ratio, and future plans.
My impression is that in Japan, AWS and Azure have roughly the same adoption rate, followed by GCP. AWS is the most widely used cloud service in Japan, and I think there is also a background reason why Azure is catching up.
Azure Databricks is also a first-party service provided by Microsoft. There are basically no differences in features between clouds, although features do arrive at different times: first on AWS, then Azure, and finally GCP.
As for future plans, our direction at present is simply to focus on the three cloud services, and any plans to expand to others are undecided.
Q: Which areas will you focus on further, and where do you want to differentiate from other services?
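Competition in the cloud services industry is fierce, so once we put forward our identity as a unified platform and began to earn recognition for it, other companies started moving toward integration-oriented services in the same way. For our part, we are currently focusing on large language models and on assistant features for AI initiatives.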
Q: What are the use cases for large language models such as Dolly?
There are many use cases for large language models, not only for Dolly. That said, they are no different from the large language model use cases commonly introduced elsewhere, such as QA bots and literature summarization.
At the same time, we offer a feature that supports customers who want to build large language models into their own applications. It is Model Serving, which lets you create a REST API for a model.
One use case is to use this feature, together with prompt engineering and the like, to build an application that behaves the way you want. In other cases customers use Databricks features for security reasons. The use cases vary with each customer's situation.
Q: What is the deciding factor when users migrate from an existing cloud to Databricks?
Cost and openness, plus use with large language models. My sense is that the most common case is customers switching because they value the use cases beyond DWH functionality, the extensibility, and the future potential.
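This is my personal opinion, but there is not much difference in features between one DWH and another, so I do not think comparing them is very meaningful. I also believe we are no longer in a situation where a DWH alone can solve a company's problems.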
Q: Are there cases where some Databricks features are added to an existing platform?
Of course. One example is migration at the level of a schema within a database. We have also started supporting query federation, so you can leave the data sources where they are, access them from Databricks, and proceed with analysis. From there, you gradually migrate the data sources that look likely to yield a cost benefit. That kind of approach is now possible.
It is not uncommon for ETL processing to be run inside a DWH. That kind of processing is quite costly, however, so we propose offloading ETL to a dedicated tool. In fact, some customers who saw a cost benefit from offloading have gone on to migrate their data gradually.
Q: On typical data platforms there are many copies of data taken at different points in time. We would like to know how to address this.
Databricks has a feature for managing data versions, so data as of a particular point in time can easily be retrieved whenever needed. There is no need to create point-in-time snapshots by hand each time.
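The idea of reading data as of an earlier version can be sketched with a small append-only table. This is a conceptual illustration only, not how Databricks stores versions: `VersionedTable` and its methods are hypothetical, and keeping a full copy per version is a simplification chosen for brevity.

```python
class VersionedTable:
    """Hypothetical append-only table that keeps every committed version."""

    def __init__(self):
        self._versions = [[]]  # version 0 is the empty table

    def commit(self, new_rows):
        """Append rows as a new version and return its version number."""
        snapshot = self._versions[-1] + list(new_rows)
        self._versions.append(snapshot)
        return len(self._versions) - 1

    def read(self, version=None):
        """Read the latest version, or the table as of an earlier version."""
        if version is None:
            version = len(self._versions) - 1
        return list(self._versions[version])


table = VersionedTable()
v1 = table.commit([{"id": 1, "status": "new"}])
v2 = table.commit([{"id": 2, "status": "new"}])

# The latest read sees both rows; reading as of v1 sees only the first.
print(len(table.read(version=v1)), len(table.read()))
```

Because earlier versions remain readable, a report can be rerun against the data exactly as it stood at a chosen commit, without anyone preparing a separate copy in advance.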
Q: What measures are taken to avoid exhausting data resources?
If data that flows in endlessly were processed as is, memory would overflow, so we use techniques such as grouping. Specifically, Apache Spark's streaming engine runs behind the scenes, and it includes a feature called watermarking.
A watermark lets you specify that data older than a given number of minutes will not be processed. By discarding old data, it keeps memory from filling up.
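The watermark behavior described in this answer can be sketched in plain Python. This is a simplified per-event illustration of the concept, not Spark's implementation: the function name, the window and delay values, and the sample event times (in seconds) are assumptions made for the example.

```python
def count_with_watermark(events, window_sec, max_delay_sec):
    """Count events per tumbling window while bounding the state kept in memory.

    events: event times in seconds, in arrival order (may be out of order).
    Returns (closed windows, still-open windows, number of dropped events).
    """
    open_windows = {}  # window_start -> count; this is the in-memory state
    closed = {}        # finalized windows, already emitted downstream
    dropped = 0
    max_event_time = float("-inf")

    for event_time in events:
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - max_delay_sec
        if event_time < watermark:
            dropped += 1  # too late: discard rather than keep state forever
            continue
        start = event_time - event_time % window_sec
        open_windows[start] = open_windows.get(start, 0) + 1
        # Finalize and evict every window that ended at or before the watermark.
        for s in [s for s in open_windows if s + window_sec <= watermark]:
            closed[s] = open_windows.pop(s)

    return closed, open_windows, dropped


closed, open_windows, dropped = count_with_watermark(
    [5, 12, 61, 8, 130, 65, 125], window_sec=60, max_delay_sec=30
)
print(closed, open_windows, dropped)
```

Only windows newer than the watermark stay in `open_windows`, so the amount of state is bounded by the allowed delay rather than by the length of the stream.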
Q: What are the best practices for integrating with the ecosystem?
Integration with BI tools such as Tableau and Power BI stands out, and in many cases customers keep using them as they are. Outside BI tools, Airflow used to be common, but Databricks has evolved and now covers nearly everything Airflow can do, apart from a few advanced features. Integrations around ETL are also common.
[Q&A] The Speaker Answers Questions from Attendees
Attendees of the event also submitted questions. Here are a few of them.
Q: In what ways is Databricks better than other data lakehouses?
Quite often, a platform that is thought of as a data lakehouse is in fact a DWH. Compared purely as a DWH, on the other hand, I do not think there is much difference. If you were to compare, it would come down to cost performance. Each platform has workloads it handles well and poorly, however, so in some cases customers run a PoC when they actually evaluate it.
That said, I honestly think it is a waste to consider Databricks only as a DWH replacement. I believe it is more appropriate to evaluate it together with its many other capabilities, such as being a unified platform and integrating with the wider ecosystem.
Q: About migration support from data lakes, DWHs, and data marts
We support all of them, and the easiest is the data lake. For example, data stored in Amazon S3 can be connected to Databricks easily, so it hardly amounts to migration work and you can start using it right away. For DWHs as well, we provide and support a range of migration services, including tools that migrate automatically.
Sky Co., Ltd.
https://www.skygroup.jp/
Careers at Sky Co., Ltd.
https://www.skygroup.jp/recruit/

Related event: Databricks, a Strong Ally for Data Utilization: Building a Data Platform... (Wednesday, August 2, 2023)
