Hi,
After googling and reading some pages (among them
http://msdn.microsoft.com/en-us/library/bb882639.aspx), I was trying
to get some code running, but I have a problem with the regular
expression.
I would like to look for files whose names are built like this:
defaulName20100223.zip
Therefore I made a regular expression string:
string regExpression = "("+myDefaultName+@"[0-9]{8}\.[zZ][iI][pP])";
where myDefaultName is just a string (it needs to be dynamic).
The file list I will query looks something like this:
c:\\myDir1\\mydir2\\defaulName20100223.zip
c:\\myDir1\\mydir2\\extra backup defaulName20100223.zip
c:\\myDir1\\mydir2\\defaulName20100224.zip
c:\\myDir1\\mydir2\\defaulName20100225.zip
c:\\myDir1\\mydir2\\copy defaulName20100223.zip
c:\\myDir1\\mydir2\\defaulName20100226.zip
c:\\myDir1\\mydir2\\defaulName20100226-copy.zip
The only files I want in the result set are:
c:\\myDir1\\mydir2\\defaulName20100223.zip
c:\\myDir1\\mydir2\\defaulName20100224.zip
c:\\myDir1\\mydir2\\defaulName20100225.zip
c:\\myDir1\\mydir2\\defaulName20100226.zip
When I add Path.DirectorySeparatorChar.ToString() to the front of
the regular expression string, the separator appears twice.
string myDefaultName = @Path.DirectorySeparatorChar + "defaultName";
string regExpression = "("+myDefaultName+@"[0-9]{8}\.[zZ][iI][pP])";
gives me "(\\defaultName[0-9]{8}\\.[zZ][iI][pP])" as a regular
expression, while DirectorySeparatorChar is actually just one \
(although I see it twice again in the result set).
How can I write the regular expression so that it does what it should?
I have pasted the rest of the code below the message.
Kind regards, and I hope you can help me out,
Matthijs
-----------------------------------
private void worker()
{
    string startFolder = @"c:\myDir1\mydir2\";
    IEnumerable<System.IO.FileInfo> fileList = GetFiles(startFolder);

    string myDefaultName = @Path.DirectorySeparatorChar + "defaultName";
    string regExpression = "(" + myDefaultName + @"[0-9]{8}\.[zZ][iI][pP])";
    System.Text.RegularExpressions.Regex searchTerm =
        new System.Text.RegularExpressions.Regex(regExpression);

    var queryMatchingFiles =
        from file in fileList
        where file.Extension == ".zip"
        let matches = searchTerm.Matches(file.FullName)
        where searchTerm.Matches(file.FullName).Count > 0
        select new
        {
            name = file.FullName,
            matches = from System.Text.RegularExpressions.Match match in matches
                      select match.Value
        };

    // just for debug mode, so I can hover over it and check the content
    queryMatchingFiles = queryMatchingFiles;
}

static IEnumerable<System.IO.FileInfo> GetFiles(string path)
{
    if (!System.IO.Directory.Exists(path))
        throw new System.IO.DirectoryNotFoundException();

    string[] fileNames = null;
    List<System.IO.FileInfo> files = new List<System.IO.FileInfo>();
    fileNames = System.IO.Directory.GetFiles(path, "*.*");
    foreach (string name in fileNames)
    {
        string onlyTheName =
            name.Substring(name.LastIndexOf(Path.DirectorySeparatorChar) + 1);
        files.Add(new System.IO.FileInfo(name));
    }
    return files;
}
Matthijs | 2/23/2010 4:10:45 PM
"Matthijs de Z" <matthijsdezwart@gmail.com> wrote in message
news:dce67044-aa6f-40c5-a572-c82cf7dd9991@z25g2000vbb.googlegroups.com...
> I tried to add Path.DirectorySeparatorChar.ToString() to the front of
> the regular expression string, I get it twice.
>
> string myDefaultName = @Path.DirectorySeparatorChar + "defaultName";
> string regExpression = "("+myDefaultName+@"[0-9]{8}\.[zZ][iI][pP])";
>
> Gives me "(\\defaultName[0-9]{8}\\.[zZ][iI][pP])" as a regular
> expression, while DirectorySeparatorChar is actually just one \
> (although I can see it twice again in the result set)
When you say "I get it twice," what are you using to make this
determination? Are you simply looking at the tooltip display in the
debugger? If so, that shows you the "C# view" of the string, with all
characters that need escaping escaped. In other words, if your string
actually contains "C:\Temp" what you'll see in the debug view is "C:\\Temp".
It's basically the IDE showing you exactly what you would need to enter in
code (without the @"" syntax) if you wanted to make this string a constant,
i.e., if you wanted to have
    string myString = "C:\\Temp";
in code.
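A quick way to confirm this outside the debugger is to print the string and its length. The sketch below is only an illustration: the class SeparatorCheck and its BuildPrefix method are made-up names, and the separator is hard-coded to \ (an assumption, matching the Windows paths in this thread) so that the result does not depend on the platform.

```csharp
using System;

static class SeparatorCheck
{
    // Hypothetical helper: builds the prefix the way the original post does,
    // but with the separator hard-coded to '\' instead of
    // Path.DirectorySeparatorChar.
    public static string BuildPrefix(string defaultName)
    {
        char separator = '\\';          // ONE character; the escape exists only in source code
        return separator + defaultName; // char + string concatenates to a string
    }

    static void Main()
    {
        string prefix = BuildPrefix("defaultName");
        Console.WriteLine(prefix);        // raw text, as a richTextBox would show it
        Console.WriteLine(prefix.Length); // 11 letters plus a single separator
    }
}
```

If the length comes out as the name's length plus one, the string holds a single backslash, whatever the debugger tooltip shows.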
Jeff | 2/23/2010 5:11:57 PM
Matthijs de Z wrote:
> Hi,
>
> After googling and reading some pages (among
> http://msdn.microsoft.com/en-us/library/bb882639.aspx), i was trying
> to get some code running, but I have some problem with the regular
> expression.
>
> I would like to look for files that match a name build like this:
> defaulName20100223.zip
>
> Therefore I made a regular expression string:
>
> string regExpression = "("+myDefaultName+@"[0-9]{8}\.[zZ][iI][pP])";
>
> [...]
> I tried to add Path.DirectorySeparatorChar.ToString() to the front of
> the regular expression string, I get it twice.
If you are trying to match on the filename only, it seems to me you'd be
better off preprocessing the path before you hand it to the regex. Just
use the Path class, with the GetFileName() method, to obtain only the
filename portion of the path, then match that against the regex.
There's probably a way to handle the directory separator characters in
the regex, but doing so seems overly complicated to me, given that .NET
already has path-specific support for manipulating strings.
Pete
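The suggestion above could look roughly like the following. This is a sketch under assumptions, not the poster's code: FileNameOnly and NameContainsPattern are hypothetical names, and Path.Combine is used only so the sample paths work with whatever separator the platform uses.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class FileNameOnly
{
    // Hypothetical helper: true when the file-name part of 'fullPath'
    // contains defaultName + 8 digits + ".zip" somewhere in it.
    public static bool NameContainsPattern(string fullPath, string defaultName)
    {
        string fileName = Path.GetFileName(fullPath); // drops the directory part
        // Regex.Escape keeps any special characters in the dynamic name literal.
        string pattern = Regex.Escape(defaultName) + @"[0-9]{8}\.[zZ][iI][pP]";
        return Regex.IsMatch(fileName, pattern);
    }

    static void Main()
    {
        // Path.Combine keeps the sample independent of the platform's separator.
        string dir = Path.Combine("myDir1", "mydir2");
        Console.WriteLine(NameContainsPattern(
            Path.Combine(dir, "defaulName20100223.zip"), "defaulName"));
        Console.WriteLine(NameContainsPattern(
            Path.Combine(dir, "copy defaulName20100223.zip"), "defaulName"));
    }
}
```

Because the pattern is not anchored, the second call also reports a match; the file name alone does not rule out names with extra text in front.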
Peter | 2/23/2010 6:06:54 PM
On 23 Feb, 18:11, "Jeff Johnson" <i....@enough.spam> wrote:
> "Matthijs de Z" <matthijsdezw...@gmail.com> wrote in message
> news:dce67044-aa6f-40c5-a572-c82cf7dd9991@z25g2000vbb.googlegroups.com...
>
> > I tried to add Path.DirectorySeparatorChar.ToString() to the front of
> > the regular expression string, I get it twice.
>
> > string myDefaultName = @Path.DirectorySeparatorChar + "defaultName";
> > string regExpression = "("+myDefaultName+@"[0-9]{8}\.[zZ][iI][pP])";
>
> > Gives me "(\\defaultName[0-9]{8}\\.[zZ][iI][pP])" as a regular
> > expression, while DirectorySeparatorChar is actually just one \
> > (although I can see it twice again in the result set)
>
> When you say "I get it twice," what are you using to make this
> determination? Are you simply looking at the tooltip display in the
> debugger?
When I hover over the variable that contains the string
(myDefaultName) I see \\, but when I add the string to a richTextBox I
see just one. So I suppose it's just one \. But still... it doesn't
work.
Any suggestions?
Regards,
Matthijs
> If so, that shows you the "C# view" of the string, with all
> characters that need escaping escaped. In other words, if your string
> actually contains "C:\Temp" what you'll see in the debug view is "C:\\Temp".
> It's basically the IDE showing you exactly what you would need to enter in
> code (without the @"" syntax) if you wanted to make this string a constant,
> i.e., if you wanted to have
>
>     string myString = "C:\\Temp"
>
> in code.
Matthijs | 2/23/2010 8:54:02 PM
On 23 Feb, 19:06, Peter Duniho <no.peted.s...@no.nwlink.spam.com> wrote:
> Matthijs de Z wrote:
> > Hi,
>
> > After googling and reading some pages (among
> > http://msdn.microsoft.com/en-us/library/bb882639.aspx), i was trying
> > to get some code running, but I have some problem with the regular
> > expression.
>
> > I would like to look for files that match a name build like this:
> > defaulName20100223.zip
>
> > Therefore I made a regular expression string:
>
> > string regExpression = "("+myDefaultName+@"[0-9]{8}\.[zZ][iI][pP])";
>
> > [...]
> > I tried to add Path.DirectorySeparatorChar.ToString() to the front of
> > the regular expression string, I get it twice.
>
> If you are trying to match on the filename only, it seems to me you'd be
> better off preprocessing the path before you hand it to the regex. Just
> use the Path class, with the GetFileName() method, to obtain only the
> filename portion of the path, then match that against the regex.
If I do that, I think I will still have a problem with, for
instance, 'copy defaulName20100223.zip'.
How can I make sure I only get names like defaulName20100223.zip?
regards,
Matthijs
> There's probably a way to handle the directory separator characters in
> the regex, but doing so seems overly complicated to me, given that .NET
> already has path-specific support for manipulating strings.
>
> Pete
Matthijs | 2/23/2010 8:56:33 PM
"Matthijs de Z" <matthijsdezwart@gmail.com> wrote in message
news:34b09643-66fb-4602-b166-888c0779dd35@n5g2000vbq.googlegroups.com...
>> > I tried to add Path.DirectorySeparatorChar.ToString() to the front of
>> > the regular expression string, I get it twice.
>
>> > string myDefaultName = @Path.DirectorySeparatorChar + "defaultName";
>> > string regExpression = "("+myDefaultName+@"[0-9]{8}\.[zZ][iI][pP])";
>
>> > Gives me "(\\defaultName[0-9]{8}\\.[zZ][iI][pP])" as a regular
>> > expression, while DirectorySeparatorChar is actually just one \
>> > (although I can see it twice again in the result set)
>
>> When you say "I get it twice," what are you using to make this
>> determination? Are you simply looking at the tooltip display in the
>> debugger?
> when I hover over the variable that contains the string
> (myDefaultName) I see \\ but when I add the string to a richTextBox I
> just see one. So I suppose it's just one \.
So then you're seeing exactly what I described.
> But still....it doesn't work...
See my other reply.
Jeff | 2/23/2010 9:06:15 PM
"Matthijs de Z" <matthijsdezwart@gmail.com> wrote in message
news:d97baa89-7a49-43ff-be51-d2b5c172319a@k11g2000vbe.googlegroups.com...
>> If you are trying to match on the filename only, it seems to me you'd be
>> better off preprocessing the path before you hand it to the regex. Just
>> use the Path class, with the GetFileName() method, to obtain only the
>> filename portion of the path, then match that against the regex.
> if I do that, i think I will still have a problem with for
> instance:'copy defaulName20100223.zip'
> How can I make sure I only get the names like defaulName20100223.zip?
> regards,
Put ^ at the beginning of the regex so that it only matches if the string
starts with the default name.
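A minimal sketch of an anchored pattern applied to the bare file name follows. AnchoredName and IsBackupName are hypothetical names, and the trailing $ is an addition that the post does not mention; it is there only to reject anything after ".zip".

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchoredName
{
    // Hypothetical helper: the WHOLE file name must be
    // defaultName + 8 digits + ".zip".
    public static bool IsBackupName(string fileName, string defaultName)
    {
        // ^ pins the match to the start of the string, $ to the end.
        string pattern = "^" + Regex.Escape(defaultName) + @"[0-9]{8}\.[zZ][iI][pP]$";
        return Regex.IsMatch(fileName, pattern);
    }

    static void Main()
    {
        string[] names =
        {
            "defaulName20100223.zip",
            "copy defaulName20100223.zip",
            "defaulName20100226-copy.zip"
        };
        foreach (string name in names)
            Console.WriteLine(name + " -> " + IsBackupName(name, "defaulName"));
    }
}
```

Note that ^ refers to the start of the string being tested, so this only works on the file name without its directory.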
Jeff | 2/23/2010 9:06:57 PM
On 23 Feb, 22:06, "Jeff Johnson" <i....@enough.spam> wrote:
> "Matthijs de Z" <matthijsdezw...@gmail.com> wrote in message
> news:d97baa89-7a49-43ff-be51-d2b5c172319a@k11g2000vbe.googlegroups.com...
>
> >> If you are trying to match on the filename only, it seems to me you'd be
> >> better off preprocessing the path before you hand it to the regex. Just
> >> use the Path class, with the GetFileName() method, to obtain only the
> >> filename portion of the path, then match that against the regex.
> > if I do that, i think I will still have a problem with for
> > instance:'copy defaulName20100223.zip'
> > How can I make sure I only get the names like defaulName20100223.zip?
> > regards,
>
> Put ^ at the beginning of the regex so that it only matches if the string
> starts with the default name.
When I use
string regExpression = @"^([0-9]{8}\.[zZ][iI][pP])";
it doesn't work unless I trim down the file name, cutting off all
directory info. But I actually need that info.
Regards,
Matthijs
Matthijs | 2/23/2010 9:28:14 PM
"Matthijs de Z" <matthijsdezwart@gmail.com> wrote in message
news:49ba1cee-1446-4728-a13b-59d83e172a60@d27g2000vbl.googlegroups.com...
>> >> If you are trying to match on the filename only, it seems to me you'd
>> >> be
>> >> better off preprocessing the path before you hand it to the regex.
>> >> Just
>> >> use the Path class, with the GetFileName() method, to obtain only the
>> >> filename portion of the path, then match that against the regex.
>> > if I do that, i think I will still have a problem with for
>> > instance:'copy defaulName20100223.zip'
>> > How can I make sure I only get the names like defaulName20100223.zip?
>> > regards,
>>
>> Put ^ at the beginning of the regex so that it only matches if the string
>> starts with the default name.
>
> when I use
> string regExpression = @"^([0-9]{8}\.[zZ][iI][pP])";
>
> It doesn't work, unless I trimdown the filename, cutting of all
> directory info. But I need need that actually..
Well, I was building on what Pete said, and he suggested that you strip off
the directory information. I didn't realize it was important to you.
Does this work:
string regExpression = @".*\\" + myDefaultName +
"(\d{8}\.[zZ][iI][pP])$";
(I replaced [0-9] with \d, since they're the same. Also, you should just
consider setting the case-insensistive option on the regex and test for
"zip" instead of the way you're doing it now, unless case in the rest of the
file name is important--but why would it be?)
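For reference, here is one way the full-path match could be put together with the case-insensitive option. It is a sketch, not the code from this thread: FullPathMatch and BuildRegex are hypothetical names, the separator is hard-coded to \ to match the Windows paths above, and the leading .* is left out because IsMatch already searches anywhere in the input. It also shows a likely reason the original pattern failed: a single literal backslash placed directly in front of "defaultName" is read by the regex engine as the start of the escape \d (a digit), so the backslash has to be escaped for the regex, which Regex.Escape does.

```csharp
using System;
using System.Text.RegularExpressions;

static class FullPathMatch
{
    // Hypothetical helper: builds a regex that accepts a full path whose last
    // segment is defaultName + 8 digits + ".zip", ignoring case.
    public static Regex BuildRegex(string defaultName)
    {
        // Regex.Escape turns the literal backslash into \\ for the regex engine,
        // so it can no longer fuse with the next letter into an escape like \d.
        string separator = Regex.Escape(@"\");
        string pattern = separator + Regex.Escape(defaultName) + @"\d{8}\.zip$";
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        Regex searchTerm = BuildRegex("defaulName");
        Console.WriteLine(searchTerm.IsMatch(@"c:\myDir1\mydir2\defaulName20100223.ZIP"));
        Console.WriteLine(searchTerm.IsMatch(@"c:\myDir1\mydir2\copy defaulName20100223.zip"));
        Console.WriteLine(searchTerm.IsMatch(@"c:\myDir1\mydir2\defaulName20100226-copy.zip"));
    }
}
```

One caveat on \d: in .NET it also matches non-ASCII Unicode digits unless RegexOptions.ECMAScript is set, so [0-9] is the stricter choice if only ASCII digits should be accepted.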
Jeff | 2/23/2010 9:48:43 PM
"Matthijs de Z" <matthijsdezwart@gmail.com> wrote in message
news:49ba1cee-1446-4728-a13b-59d83e172a60@d27g2000vbl.googlegroups.com...
[Reply sent too soon.]
I also wanted to recommend that you go get a utility which will help you
test regular expressions. I like Expresso, which is free.
http://www.ultrapico.com.
Jeff | 2/23/2010 9:50:01 PM
"Jeff Johnson" <i.get@enough.spam> wrote in message
news:OivT5HNtKHA.5976@TK2MSFTNGP05.phx.gbl...
> case-insensistive
(I can't tell you how many times I backspaced to correct this word and yet I
STILL got it wrong!!)
Jeff | 2/23/2010 9:58:43 PM
>     string regExpression = @".*\\" + myDefaultName +
> "(\d{8}\.[zZ][iI][pP])$";
Adding an @ to the "(\d{8}\.[zZ][iI][pP])$" part was the final thing.
Now it works fine.
Thanks, all!
Matthijs | 2/24/2010 8:09:42 AM
> I also wanted to recommend that you go get a utility which will help you
> test regular expressions. I like Expresso, which is free. http://www.ultrapico.com.
And thanks for the espresso. I like a cup of coffee in the morning,
especially this one. Very useful.
Matthijs | 2/24/2010 8:19:21 AM