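Let's Try Regular Expressions in Python!
Using regular expressions in Python isn't hard! They make it easy to search for, extract, and replace things like e-mail addresses and phone numbers in a string. This article introduces the metacharacters and special sequences used to build regular expression patterns, how to escape special characters, the re module you need in order to use them, and the functions that module provides. Regular expressions are used in many programming languages. So what exactly is a regular expression?
Which modules and functions do you need to use regular expressions in Python? We will also walk through sample code that uses a regular expression to extract mobile phone numbers from a string.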
What is a regular expression?
A regular expression is a pattern used to match combinations of characters within a string. The full English term is "regular expression", often shortened to RE, regex, or regex pattern.
With regular expressions you can search a string for text that follows a specific pattern (e-mail addresses, phone numbers, and so on) and extract it.
Specifically, they are commonly used for tasks such as the following.
- Pattern matching and replacement in strings
- Text parsing
- Data validation
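Regular expression notation
Regular expressions are written with metacharacters and special sequences, plus escapes for representing the special symbols themselves.
Regular expression metacharacters
Regular expressions are built from a handful of special characters called "metacharacters", each of which carries its own meaning. For example, "$" means the end of the string.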
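Note that "$" anchors to the end of the whole string, not to the end of each word. In the string "Enjoy your adventure, find more about Japan", the pattern "re$" therefore matches nothing, because the string ends with "Japan". To pick out the words that end in "re" ("adventure" and "more"), use a word boundary instead, as in "\w*re\b".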
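To see how "$" behaves on that sentence, here is a small sketch. The word-boundary pattern \w*re\b is an illustrative addition, one possible way to pick out words ending in "re", and the variable names are made up for this example.

```python
import re

text = "Enjoy your adventure, find more about Japan"

# "$" anchors at the end of the whole string, and this string ends in "Japan".
end_anchored = re.findall(r"re$", text)

# A word boundary (\b) finds every word that ends in "re" instead.
words_ending_in_re = re.findall(r"\w*re\b", text)

print(end_anchored)        # []
print(words_ending_in_re)  # ['adventure', 'more']
```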
The most frequently used metacharacters are listed below.
| Metacharacter | What it matches | Example | Strings matched |
|---|---|---|---|
| . | Any character except a newline | a.c | abc, aac, acc, a1c … |
| ^ | The start of the string | ^ab | abc, abd, ab1 … |
| $ | The end of the string | yz$ | xyz, yyz, 1yz … |
| * | Zero or more repetitions of the preceding expression | ab* | a, ab, abb, abbb … |
| + | One or more repetitions of the preceding expression | ab+ | ab, abb, abbb … |
| ? | Zero or one repetition of the preceding expression | ab? | a, ab |
| {m} | Exactly m repetitions of the preceding expression | a{3} | aaa |
| {m,n} | From m to n repetitions of the preceding expression, as many as possible | a{2,4} | aa, aaa, aaaa |
| [ ] | A set of characters; matches any single character listed inside the brackets | [amk] / [x-z] | a, m, k / x, y, z |
| A\|B | Either A or B | a\|b | a, b |
| ( ) | Start and end of a group; matches the expression enclosed in the parentheses | (xyz)+ | xyz, xyzxyz … |
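The rows of the table can be checked with a short script. This is only a sketch: the helper name matches_fully and the test strings are assumptions chosen for illustration, and re.fullmatch is used so that each string has to match the pattern in its entirety.

```python
import re

# (pattern, strings that match in full, strings that do not)
cases = [
    (r"a.c", ["abc", "a1c"], ["ac", "a\nc"]),
    (r"ab*", ["a", "abbb"], ["b"]),
    (r"ab+", ["ab", "abb"], ["a"]),
    (r"ab?", ["a", "ab"], ["abb"]),
    (r"a{3}", ["aaa"], ["aa"]),
    (r"a{2,4}", ["aa", "aaaa"], ["a", "aaaaa"]),
    (r"[amk]", ["a", "m", "k"], ["x"]),
    (r"[x-z]", ["x", "y", "z"], ["w"]),
    (r"a|b", ["a", "b"], ["c"]),
    (r"(xyz)+", ["xyz", "xyzxyz"], ["xy"]),
]

def matches_fully(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

for pattern, should_match, should_not_match in cases:
    assert all(matches_fully(pattern, s) for s in should_match), pattern
    assert not any(matches_fully(pattern, s) for s in should_not_match), pattern

# ^ and $ anchor a search to the start and the end of the string.
starts_with_ab = re.search(r"^ab", "abc") is not None  # True
not_at_start = re.search(r"^ab", "cab") is not None    # False
ends_with_yz = re.search(r"yz$", "xyz") is not None    # True
print(starts_with_ab, not_at_start, ends_with_yz)
```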
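Regular expression special sequences
| Special sequence | What it matches | Equivalent character class |
|---|---|---|
| \d | Any digit | [0-9] |
| \D | Any non-digit | [^0-9] |
| \s | Any whitespace character | [ \t\n\r\f\v] |
| \S | Any non-whitespace character | [^ \t\n\r\f\v] |
| \w | Any alphanumeric character or underscore | [a-zA-Z0-9_] |
| \W | Any character other than an alphanumeric character or underscore | [^a-zA-Z0-9_] |
For str patterns in Python 3 these sequences are Unicode-aware by default, so the character classes in the right-hand column are exact equivalents only when the re.ASCII flag is set.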
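A quick way to get a feel for the special sequences is to run them against a small string. The sample strings below are invented for this sketch. The last two lines illustrate that, for str patterns, \d also matches full-width digits unless re.ASCII is passed.

```python
import re

sample = "Room 42,\tfloor_3 !"

digits = re.findall(r"\d", sample)      # ['4', '2', '3']
spaces = re.findall(r"\s", sample)      # [' ', '\t', ' ']
words = re.findall(r"\w+", sample)      # ['Room', '42', 'floor_3']
non_space = re.findall(r"\S+", sample)  # ['Room', '42,', 'floor_3', '!']

# Full-width digits count as \d for str patterns unless re.ASCII is given.
mixed = "１２3"
unicode_digits = re.findall(r"\d", mixed)          # ['１', '２', '3']
ascii_digits = re.findall(r"\d", mixed, re.ASCII)  # ['3']

print(digits, words, unicode_digits, ascii_digits)
```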
To match a metacharacter or any other special character literally, escape it with a backslash ("\"), which Japanese fonts often display as "¥". For example, to match a literal question mark "?", write "\?".
Other special characters can be escaped in the same way.
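The effect of escaping can be sketched as follows. The example strings are made up, and re.escape is shown as one convenient way to escape arbitrary text automatically rather than by hand.

```python
import re

text = "Is this 3.5? Yes (really)."

# Unescaped, "." matches any character; escaped, it matches a literal dot only.
any_char = re.findall(r"3.5", "3.5 and 3x5")      # ['3.5', '3x5']
literal_dot = re.findall(r"3\.5", "3.5 and 3x5")  # ['3.5']

# "\?" matches a literal question mark.
question_marks = re.findall(r"\?", text)          # ['?']

# re.escape adds the backslashes for you.
escaped = re.escape("(really)")
found = re.search(escaped, text).group()          # '(really)'

print(any_char, literal_dot, question_marks, found)
```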
Using regular expressions in Python
You need the re module to use regular expressions!
In Python, regular expressions become available once you import the re module. The module provides functions for working with strings, such as searching, replacing, and splitting, and it returns either the matched strings or match objects that carry information about each match. For details, see the "Module Contents" section of the official re documentation. This section covers the module and its main functions.
Let's import the re module.
import re
Easy, isn't it?
Now let's look at the main functions in this module.
| Function | What it does | Return value |
|---|---|---|
| match | Looks for a match at the beginning of the string | The corresponding match object (None if there is no match) |
| search | Looks for a match anywhere in the string, not only at the beginning | The match object for the first match (None if there is no match) |
| findall | Finds every match | A list of all matched strings (an empty list if there is no match) |
| finditer | Finds every match | An iterator over the corresponding match objects (it yields nothing if there is no match) |
| fullmatch | Checks whether the whole string matches | The corresponding match object (None if there is no match) |
| sub | Replaces the matched text | The string after replacement |
match function
re.match(pattern, string, flags=0)
If the regular expression pattern matches at the beginning of the string, this returns the corresponding match object. If it does not, it returns None.
Use this function to check whether the text you are looking for appears at the very start of a string.
To look for a match anywhere in the string, use the search function described next.
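As a minimal sketch of how re.match behaves, the snippet below uses two invented strings, one where the digits come first and one where they do not.

```python
import re

# The pattern matches because the string starts with three digits.
at_start = re.match(r"\d{3}", "090-0065-2150 is my number")
print(at_start.group())  # 090
print(at_start.span())   # (0, 3)

# The same digits appear later in the string, so match returns None.
not_at_start = re.match(r"\d{3}", "Tel 090-0065-2150")
print(not_at_start)      # None
```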
search function
re.search(pattern, string, flags=0)
Scans the whole string for the first place where the pattern matches and returns the corresponding match object. If nothing matches, it returns None.
Use this function to check whether the text you are looking for appears anywhere in a string. To collect every match rather than only the first, use the findall function described next.
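The following sketch shows that re.search finds only the first match, even when the string contains more than one. The input string is a shortened, illustrative variant of the kind of data used later in this article.

```python
import re

text = "Tel 090-0065-2150, Tel 080-0064-2120"

first = re.search(r"\d{3}-\d{4}-\d{4}", text)
print(first.group())               # 090-0065-2150
print(first.start(), first.end())  # 4 17

# No phone number here, so the result is None.
missing = re.search(r"\d{3}-\d{4}-\d{4}", "no phone number here")
print(missing)                     # None
```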
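findall function
re.findall(pattern, string, flags=0)
Scans the whole string and returns every string that matches the pattern, as a list in the order the matches were found. If the pattern contains capturing groups, the list holds the group contents instead of the whole match.
This function is very handy when you want to extract several e-mail addresses or phone numbers as a list.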
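Here is a rough sketch of re.findall on e-mail addresses. The addresses are placeholders, and the pattern is a deliberately simplified assumption that does not cover every valid address. The second call shows how capturing groups change the shape of the result.

```python
import re

text = "alice@example.com, bob@example.org"

# Simplified e-mail pattern for illustration only.
addresses = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
print(addresses)  # ['alice@example.com', 'bob@example.org']

# With capturing groups, findall returns tuples of the groups.
parts = re.findall(r"([\w.+-]+)@([\w-]+\.[\w.-]+)", text)
print(parts)      # [('alice', 'example.com'), ('bob', 'example.org')]

# No match gives an empty list, not None.
nothing = re.findall(r"\d+", "no digits")
print(nothing)    # []
```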
finditer function
re.finditer(pattern, string, flags=0)
Scans the whole string and returns an iterator that yields a match object for every non-overlapping match. If nothing matches, the iterator is simply empty; it is not None. Where findall returns a list of strings, finditer gives you match objects, so you can also read the start and end position of each match.
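The difference from findall can be sketched like this. The input string is illustrative, and the tuple layout (text, start, end) is just one convenient way to collect the results.

```python
import re

text = "Tel 090-0065-2150, Tel 080-0064-2120"

# Each item yielded by finditer is a match object with position information.
matches = [
    (m.group(), m.start(), m.end())
    for m in re.finditer(r"\d{3}-\d{4}-\d{4}", text)
]
print(matches)  # [('090-0065-2150', 4, 17), ('080-0064-2120', 23, 36)]

# With no match the iterator is empty.
no_matches = list(re.finditer(r"\d", "abc"))
print(no_matches)  # []
```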
fullmatch function
re.fullmatch(pattern, string, flags=0)
If the entire string matches the regular expression pattern, this returns the corresponding match object. If it does not, it returns None.
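re.fullmatch suits data validation, where the whole input has to fit the pattern. The helper below is a hypothetical sketch: the name is_mobile_number is made up, and it assumes that mobile numbers start with 070, 080, or 090 and are written with hyphens.

```python
import re

def is_mobile_number(text):
    """Return True if text is exactly one hyphenated mobile number.

    Assumption for this sketch: mobile numbers start with 070, 080, or 090.
    """
    return re.fullmatch(r"0[789]0-\d{4}-\d{4}", text) is not None

print(is_mobile_number("090-0065-2150"))      # True
print(is_mobile_number("Tel 090-0065-2150"))  # False (extra text in front)
print(is_mobile_number("090-0065-21500"))     # False (one digit too many)
```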
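sub function
re.sub(pattern, repl, string, count=0, flags=0)
This function replaces as well as searches. It scans the whole string and returns a copy in which every place that matches the pattern has been replaced with repl. Setting count to a positive number limits how many replacements are made.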
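A few typical uses of re.sub, as a sketch with invented inputs: masking part of a number, collapsing whitespace, limiting replacements with count, and reusing captured groups through backreferences such as \1.

```python
import re

# Mask the last four digits.
masked = re.sub(r"\d{4}$", "****", "090-0065-2150")
print(masked)       # 090-0065-****

# Collapse any run of whitespace into a single space.
collapsed = re.sub(r"\s+", " ", "a   b \t c")
print(collapsed)    # a b c

# count=1 replaces only the first match.
first_only = re.sub(r"-", "", "090-0065-2150", count=1)
print(first_only)   # 0900065-2150

# Backreferences reuse the captured groups in the replacement.
no_hyphens = re.sub(r"(\d{3})-(\d{4})-(\d{4})", r"\1\2\3", "Tel 090-0065-2150")
print(no_hyphens)   # Tel 09000652150
```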
A worked example with regular expressions
Let's actually extract mobile phone numbers from some sample data using a regular expression pattern.
The sample data is as follows.
幌呂 〒 085-1199 北海道阿寒郡鶴居村幌呂西 Tel 090-0065-2150
鶴居 〒 085-1299 北海道阿寒郡鶴居村鶴居西 Tel 080-0064-2120
下久著呂簡易郵便局 〒 085-1362 北海道阿寒郡鶴居村下久著呂 Tel 090-0064-2160
旭川新富郵便局 〒 070-0002 北海道旭川市新富二条 Tel 080-0026-3902
From this sample data we will extract the mobile phone numbers (090-0065-2150, 080-0064-2120, 090-0064-2160, 080-0026-3902).
Sample code
[Usage example]
import re
sample_data = "幌呂 〒 085-1199 北海道阿寒郡鶴居村幌呂西 Tel 090-0065-2150, 鶴居 〒 085-1299 北海道阿寒郡鶴居村鶴居西 Tel 080-0064-2120, 下久著呂簡易郵便局 〒 085-1362 北海道阿寒郡鶴居村下久著呂 Tel 090-0064-2160, 旭川新富郵便局 〒 070-0002 北海道旭川市新富二条 Tel 080-0026-3902"
mobile_phone_list = re.findall(r'[0-9]{3}-[0-9]{4}-[0-9]{4}', sample_data)
if mobile_phone_list:
    print(mobile_phone_list)  # printed only when at least one number was found
[Output]
['090-0065-2150', '080-0064-2120', '090-0064-2160', '080-0026-3902']
Code walkthrough
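Here is what the sample code does.
Line 1: import re imports the re module.
Line 2: Assigns the sample string data to sample_data.
Line 3: Uses the findall function to extract the mobile phone numbers.
The first argument is the regular expression pattern for a mobile phone number: three digits, a hyphen, four digits, a hyphen, and four more digits. The postal codes in the data have only a three-digit and a four-digit part, so they do not match.
The second argument is the sample string data.
The strings that match the pattern are stored in the list mobile_phone_list.
Lines 4-5: Prints the mobile phone numbers stored in mobile_phone_list.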
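The pattern in the sample accepts any number in the 3-4-4 digit layout. If you wanted to be stricter, one possible variation is sketched below. It is not part of the original sample: the name MOBILE_PATTERN is made up, the 070/080/090 prefix rule is an assumption, and the 011 number in the text is invented purely to show a non-mobile number being skipped.

```python
import re

# Assumption for this sketch: mobile numbers start with 070, 080, or 090.
# The lookarounds keep the match from starting or ending inside a longer digit run.
MOBILE_PATTERN = re.compile(r"(?<!\d)0[789]0-\d{4}-\d{4}(?!\d)")

# Shortened sample data; the Fax number is invented for illustration.
text = "〒 085-1199 Tel 090-0065-2150, 〒 070-0002 Tel 080-0026-3902, Fax 011-0000-0000"

mobile_numbers = MOBILE_PATTERN.findall(text)
print(mobile_numbers)  # ['090-0065-2150', '080-0026-3902']
```

Compiling the pattern once with re.compile is mainly useful when the same pattern is applied to many strings.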
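Summary
How was it? Once you lift the lid on regular expressions, they turn out to be easier than expected, don't they? In this article we covered the module and functions you need to use regular expressions in Python.
Put the re module to work in your own code and search, replace, and split strings efficiently!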