通配符匹配串 -- 编程驿站-- 编程爱好者博客

Wildcard Characters

Time Limit: 1000ms, Special Time Limit:2500ms, Memory Limit:32768KB

Problem description

In computer (software) technology, a wildcard character can be used to substitute for any other character or characters in a string.
The asterisk (*) usually substitutes as a wildcard character for any zero or more characters, and the question mark (?) usually substitutes as a wildcard character for any one character, as in the CP/M, DOS, Microsoft Windows and POSIX (Unix) shells. (In Unix this is referred to as glob expansion.)

For example: a string "wildcard" can be matched by expression "*", "**", "????????", "?ildcar?", "*card", "?ild*d", etc.

Input

The first line of input contains a non-negtive integer t(1 <= t <= 1000), then followed t lines, each contains a single word(composed by alphabet characters, digits or underlines "_", and no longer than 10 characters), the next line contains a non-negtive integer n(1 <= n <= 30), then followed n lines, each line is an expression, which my contain some wildcard characters.

Output

For each expression, you should output the number of words that matches this expression.(the match is case-insensitive).

Sample Input

5
perl
public
python
pearl
private
4
p*
??????
*l
*i*

Sample Output

解题思路：

这题开始编写的时候，我很快陷入那种填补式代码的困境，就是不断的调试，发现一个错误然后纠正下代码，又调试，又纠正，最后代码基本上失去管理。。。于是我索性把那些代码全部删除了，整理下思路并在纸上表示出。终于AC了。。呵呵，思路是最重要的。另外平时练习ACM也应该注意代码的可读性和可复用性。

首先含通配符的串归结为两类：以 * 开头，以?cc 即不含* 开头的子串开头。对?cc 这一类，我们先找到一个匹配的串，此后以该串后的开始位置匹配后面的串。统一以*开始匹配，我们选出两个*之间（也含*到结尾）的串进行匹配，匹配成功则记录该串的值；否则就匹配失败。我们只记录最后的这种串。扫描完表达式串后，如果原串匹配完，或者表达式串的最后一个字符是*则匹配成功，否则在原串的最后位置匹配保存的串。原理： * 后面的串可以是原串后面的任意匹配串。。

如：原串 str 为： agksdjfDFSgskjfkKKK 表达式 exp 串为： *dj*df*??jfk*

匹配过程：扫描表达式串 exp 第一位置为 * ，

找到子串 dj　　匹配位置：　i=4 ,　保存到字符串 lastSubstr = "dj"

继续扫描 exp (从 dj后面开始扫）,　找到子串 df(DF) ,忽略大小写比较　lastSubstr="df"

……　最后lastSubstr="??jfk", 因为 exp 的最后字符为　＊　，匹配成功；

另：　如果 str = "aksfjkskdfjkslSKD" exp="aks*jk*Skd"

首先 aks = aks ，exp="*jk*Skd"( 前面的可以不管了)

lastSubstr = jk, 最后 lastSubstr = "Skd". 在 str 的 skdfjkslSKD　匹配，　str 剩余串为

dfjkslSKD, 因为　exp　的最后字符不是 *, 则 lastSubstr　必须与 str 的剩余串尾部匹配这里是 "Skd"="SKD", 匹配成功。

／／／／／／／／　代码　／／／／／／／／／／

#include <iostream>
#include <string>
using namespace std;
const int CSTR_LENGTH = 20;
char gStr[1000][CSTR_LENGTH];

/* 串匹配辅助函数（在 cstr 里查找 str 子串）
* 参数： i    cstr 的起始匹配下标（匹配成功则返回起始位置）
*        it   用于返回子串在 cstr 中的最后位置
*   str 待匹配子串串,可能含有通配符 ?
*        cstr 原串
* 返回值：匹配成功则返回true,同时i,it记录匹配位置, 否则 false
**/
bool searchMatch(unsigned int& i, unsigned int& it,
        const string& str, const char cstr[]){
    unsigned int si, j;
    for(si=i; cstr[si]!=NULL; si++){
        for(it=si,j=0; j<str.size() && cstr[it]!=NULL &&
            (str[j]=='?' || str[j]==cstr[it]); j++)
            it++;
        if(j>=str.size()){
            i=si;
            return true;
        }
    }
    return false;
}

/* 通配符匹配串函数(长度不限)
* 参数： str[], 待匹配的C风格字符串（以NULL结尾）
*       exp[], 含有通配符 ? * 的C风格字符串
*   isCase : false 忽略大小写匹配， true 区分大小写
* 返回值: 如果 exp 可以匹配 str 则返回 true ,否则 false
* 注意：该函数在忽略大小写比较时，会改变原串
**/
bool isMatch(char str[], char exp[], bool isCase){
    /// 是否忽略大小写匹配，是，则全部转为小写形式
    if(!isCase){
        char *p;
        for(p=str; *p!=NULL; p++){
            if(isupper(*p))
                *p = tolower(*p);
        }
        for(p=exp; *p!=NULL; p++)
            if(isupper(*p))
                *p = tolower(*p);
    }
    /// 空串处理，空串或者 “**...*”可以匹配空串
    if(str[0]==NULL){
        int j;
        for(j=0; exp[j] && exp[j]=='*';)
            j++;
        return exp[j]==NULL;
    }else if(exp[0]==NULL)
        return false;
    /// 如果 exp 不以 * 开始，则先匹配开始部分
    unsigned i, j, it, jt;
    for(i=j=0; str[i]!=NULL&&exp[j]!=NULL&&exp[j]!='*'; i++,j++){
        if(exp[j]!='?' && str[i]!=exp[j])
            return false;
    }
    if(str[i]==NULL){
        for(; exp[j] && exp[j]=='*';)
            j++;
        return exp[j]==NULL;
    }else if(exp[j]==NULL)
        return false;
    /// 扫描以 * 开始的串，进行匹配
    string lastSubstr;
    for(jt=j; exp[jt]!=NULL ; i=it,j=jt){
        for(; exp[j]!=NULL && exp[j]=='*';)
            j++;
        if(exp[j]==NULL)
            break;
        for(jt=j; exp[jt]!=NULL && exp[jt]!='*';)
            jt++;
        string tmp(&exp[j],&exp[jt]);
        if(!searchMatch(i,it,tmp,str))
            return false;
        lastSubstr = tmp;
    }
    /// 最后一个子串（exp中的）处理。
    if(exp[j-1]=='*')
        return true;
    for(i=strlen(str)-lastSubstr.length(),j=0; j<lastSubstr.length(); i++,j++)
        if(lastSubstr[j]!='?' && str[i]!=lastSubstr[j])
            return false;
    return true;
}

int main(){
    int n, m;
    char str[CSTR_LENGTH];
    cin>>n;
    for(int i=0; i<n; i++)
        cin>>gStr[i];
    cin>>m;
    int j, match;
    for(; m--; ){
        cin>>str;
        for(match=j=0; j<n; j++)
            if(isMatch(gStr[j],str,false))
                match++;
        cout<<match<<endl;
    }
    return 0;
}

博客介绍

正文

通配符匹配串2007-09-07 22:48:00

评论