Python Regular Expressions: The re Module

Author: Leo


Last edited: Wed Jan 1, 2020 at 10:13



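Regular expressions in Python are provided by the re module.

Building a pattern object

p = re.compile(reg_str, re.IGNORECASE | re.MULTILINE)

This builds a pattern object from the pattern string reg_str, with case-insensitive matching and multi-line mode enabled.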
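As a small illustration of what the two flags change, the sketch below compiles the same pattern with and without them. The sample text and the names text, plain and flagged are made up for this illustration, and Python 3 syntax is assumed.

```python
import re

# Hypothetical sample text: three lines, with "error" in different cases
text = "Error: disk full\nwarning: low memory\nERROR: timeout"

# Without flags, ^ anchors only at the start of the whole string
# and letters are matched case-sensitively.
plain = re.compile(r"^error")

# IGNORECASE makes letters match regardless of case;
# MULTILINE lets ^ also match right after every newline.
flagged = re.compile(r"^error", re.IGNORECASE | re.MULTILINE)

plain_hits = plain.findall(text)
flagged_hits = flagged.findall(text)

print(plain_hits)    # []
print(flagged_hits)  # ['Error', 'ERROR']
```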

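MATCH and SEARCH

match

match succeeds only if the pattern matches starting at the very beginning of the target string. It does not require the pattern to consume the whole string; for that, use fullmatch (Python 3.4+) or end the pattern with \Z.

search

search scans the target string for the first substring that matches the regular expression (similar to PHP's preg_match).

Example: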

# -*- coding: utf-8 -*-
'''
Regular expressions: match vs. search
'''

import re

target_str = '<img src="http://maps.google.cn/maps/api/staticmap?zoom=12&size=270x180&markers=icon:http://static.qyer.com/images/place5/icon_mapno_big.png|35.715038,139.796799&sensor=false">'
# The "|" before the coordinates is a literal character in the URL, so it is
# escaped as "\|"; an unescaped "|" would be read as alternation.
reg_str = r"http://maps.google.cn.*\|(.*?),(.*?)&sensor=false"

pattern = re.compile(reg_str, re.IGNORECASE | re.MULTILINE)

print(pattern)

# search scans the whole string, so it finds the URL inside the <img> tag
searchobj = pattern.search(target_str)
if searchobj:
    print(searchobj)
    print(searchobj.group())
    print(searchobj.groups())
else:
    print('no match')

# match only tries position 0, where target_str begins with "<img", so it fails
matchobj = pattern.match(target_str)
# print(matchobj)  # None
if matchobj:
    print(matchobj)
else:
    print('no match')

# target_str2 begins with the URL itself, so match succeeds here
target_str2 = 'http://maps.google.cn/maps/api/staticmap?zoom=12&size=270x180&markers=icon:http://static.qyer.com/images/place5/icon_mapno_big.png|35.715038,139.796799&sensor=false'
pattern2 = re.compile(reg_str)
matchobj2 = pattern2.match(target_str2)
if matchobj2:
    print(matchobj2)
    print(matchobj2.group())
    print(matchobj2.groups())
else:
    print('no match')

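Note:

The pattern.match(target_str) call above cannot succeed, because target_str begins with <img src=" rather than with the URL. search is needed there.

This is the difference between match and search. match tries the pattern only at the beginning of the string, so, combined with an end anchor or replaced by fullmatch, it suits validation tasks such as phone numbers or email addresses. search suits pulling a wanted substring out of something like an HTML document.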
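One detail worth checking when using match for validation is that it anchors only at the start of the string, so trailing characters are silently accepted. The sketch below compares match, fullmatch and search. The 11-digit phone pattern and the sample strings are assumptions made for illustration, and fullmatch requires Python 3.4 or later.

```python
import re

# Hypothetical validation pattern: exactly 11 digits
phone = re.compile(r"\d{11}")

candidate = "13800138000abc"  # 11 digits followed by junk

# match anchors only at the start, so the trailing "abc" is ignored
starts_ok = phone.match(candidate) is not None

# fullmatch requires the entire string to match the pattern
whole_ok = phone.fullmatch(candidate) is not None

# search accepts a match anywhere in the string
found = phone.search("tel: 13800138000")

print(starts_ok)      # True
print(whole_ok)       # False
print(found.group())  # 13800138000
```

For input validation, fullmatch (or an explicit \Z at the end of the pattern) is usually the safer choice.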

FINDALL and FINDITER

findall

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.

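  1. Returns a list.

  2. When the pattern contains a group, the list holds the text captured by the group rather than the whole match.

  3. When the pattern contains more than one group, the groups of each match form a tuple, and the result is a list of those tuples.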
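The three cases above can be sketched as follows. The sample string and variable names are made up for this illustration, and Python 3 syntax is assumed.

```python
import re

text = "lat=35.715038,lng=139.796799"  # hypothetical sample

# No group: a list of the whole matches
no_group = re.findall(r"\d+\.\d+", text)

# One group: a list of what the group captured
one_group = re.findall(r"lat=(\d+\.\d+)", text)

# Two groups: a list of tuples, one tuple per match
two_groups = re.findall(r"(\w+)=(\d+\.\d+)", text)

print(no_group)    # ['35.715038', '139.796799']
print(one_group)   # ['35.715038']
print(two_groups)  # [('lat', '35.715038'), ('lng', '139.796799')]
```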

finditer

Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result unless they touch the beginning of another match.

Returns an iterator of MatchObject instances. This is the counterpart of PHP's preg_match_all.

Example:
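Because each item produced by finditer is a match object rather than a plain string, the position of every match is available alongside its groups. A minimal sketch, with a made-up sample string and Python 3 syntax assumed:

```python
import re

text = "a=1, b=22, c=333"  # hypothetical sample

records = []
for m in re.finditer(r"(\w)=(\d+)", text):
    # Each m is a match object: whole match, captured groups, and (start, end)
    records.append((m.group(), m.groups(), m.span()))

for record in records:
    print(record)
# ('a=1', ('a', '1'), (0, 3))
# ('b=22', ('b', '22'), (5, 9))
# ('c=333', ('c', '333'), (11, 16))
```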

# -*- coding: utf-8 -*-
'''
Regular expressions: finditer and findall
'''

import re

# A string literal that spans several lines needs triple quotes
target_str = '''
    <img src="http://maps.google.cn/maps/api/staticmap?zoom=12&size=270x180&markers=icon:http://static.qyer.com/images/place5/icon_mapno_big.png|35.715038,139.796799&sensor=false">
    <img src="http://maps.google.cn/maps/api/staticmap?zoom=12&size=270x180&markers=icon:http://static.qyer.com/images/place5/icon_mapno_big.png|35.7138,13.796799&sensor=false">
    '''
# "\|" matches the literal "|" in the URL; unescaped it would mean alternation
reg_str = r"http://maps.google.cn.*?\|(.*?),(.*?)&sensor=false"

# finditer yields one match object per <img> URL
f_iter = re.finditer(reg_str, target_str, re.I)
print(f_iter)
for item in f_iter:
    print(item.group())
    print(item.groups())

'''
Output as recorded by the author (the object address will vary):
<callable-iterator object at 0x0000000002184588>
http://maps.google.cn/maps/api/staticmap?zoom=12&size=270x180&markers=icon:http://static.qyer.com/images/place5/icon_mapno_big.png|35.715038,139.796799&sensor=false
('35.715038', '139.796799')
http://maps.google.cn/maps/api/staticmap?zoom=12&size=270x180&markers=icon:http://static.qyer.com/images/place5/icon_mapno_big.png|35.7138,13.796799&sensor=false
('35.7138', '13.796799')
'''

# findall returns only the captured groups, one tuple per match
f_all = re.findall(reg_str, target_str, re.I)
print(f_all)

'''
[('35.715038', '139.796799'), ('35.7138', '13.796799')]
'''


