Match date values in the formats dd-mm-yyyy, dd/mm/yyyy and dd.mm.yyyy.
The year should be from 1900 to 2099.
Let's start with matching the valid day values in the given text of numbers.
The day value should satisfy one of the following conditions:
- 01 to 09: 0[1-9]
- 10 to 29: [12][0-9]
- 30 or 31: 3[01]

Using alternation (|), we can accommodate all three conditions in one group: \b(0[1-9]|[12][0-9]|3[01])\b. For now, we wrap the pattern with \b so that it matches each number as a whole and not part of a longer number.
38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 08 11
match | position | $1 |
---|---|---|
09 | 3 | 09 |
19 | 6 | 19 |
30 | 17 | 30 |
15 | 25 | 15 |
24 | 28 | 24 |
29 | 31 | 29 |
23 | 34 | 23 |
20 | 37 | 20 |
01 | 43 | 01 |
22 | 46 | 22 |
18 | 65 | 18 |
28 | 71 | 28 |
31 | 74 | 31 |
21 | 98 | 21 |
12 | 104 | 12 |
04 | 107 | 04 |
03 | 110 | 03 |
10 | 113 | 10 |
26 | 118 | 26 |
14 | 127 | 14 |
27 | 133 | 27 |
13 | 141 | 13 |
06 | 150 | 06 |
16 | 158 | 16 |
17 | 169 | 17 |
25 | 182 | 25 |
08 | 196 | 08 |
11 | 199 | 11 |
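
The same check can be reproduced outside the interactive tester. The following is a minimal Python sketch using the standard `re` module; the names `DAY` and `find_days` are illustrative and not part of the original lesson, and the sample is just the beginning of the input text above.

```python
import re

# Day values 01-31, wrapped in \b so that only whole numbers match.
DAY = re.compile(r"\b(0[1-9]|[12][0-9]|3[01])\b")

def find_days(text):
    """Return (match, position) pairs for every valid day value in text."""
    return [(m.group(1), m.start()) for m in DAY.finditer(text)]

sample = "38 09 19 2019 35 30 2000 15"
print(find_days(sample))
# [('09', 3), ('19', 6), ('30', 17), ('15', 25)]
```

Without the `\b` anchors, the 20 and 19 inside 2019 would also be reported as day values.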
The month value should satisfy one of the following conditions:
- 01 to 09: 0[1-9]
- 10 to 12: 1[012]

38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 08 11
match | position | $1 |
---|---|---|
09 | 3 | 09 |
01 | 43 | 01 |
12 | 104 | 12 |
04 | 107 | 04 |
03 | 110 | 03 |
10 | 113 | 10 |
06 | 150 | 06 |
08 | 196 | 08 |
11 | 199 | 11 |
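
As a rough Python equivalent of the month check (a sketch only; `MONTH` and `find_months` are hypothetical names, and the sample string is a hand-picked subset of numbers from the input text):

```python
import re

# Month values 01-12: a leading 0 with 1-9, or a leading 1 with 0, 1 or 2.
MONTH = re.compile(r"\b(0[1-9]|1[012])\b")

def find_months(text):
    """Return every whole number in text that is a valid month value."""
    return [m.group(1) for m in MONTH.finditer(text)]

print(find_months("09 19 2019 01 12 13 0210 2"))
# ['09', '01', '12']
```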
The year value should satisfy the following conditions:
- It starts with 19 or 20: (19|20)
- It is followed by any two digits: \d{2}

38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 08 11
match | position | $1 |
---|---|---|
2019 | 9 | 20 |
2000 | 20 | 20 |
1975 | 60 | 19 |
1925 | 136 | 19 |
2025 | 153 | 20 |
2005 | 164 | 20 |
1980 | 177 | 19 |
1901 | 191 | 19 |
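
The $1 column above holds only 19 or 20 because the capturing group covers just the first two digits of the year. A small Python sketch makes this visible (the names `YEAR` and `find_years` are assumptions for illustration, and the sample is a subset of the input text):

```python
import re

# Years 1900-2099: a captured prefix of 19 or 20, then any two digits.
YEAR = re.compile(r"\b(19|20)\d{2}\b")

def find_years(text):
    """Return (full match, group 1) pairs for every valid year in text."""
    return [(m.group(0), m.group(1)) for m in YEAR.finditer(text)]

print(find_years("19 2019 190 20100 1975 0210 1901"))
# [('2019', '20'), ('1975', '19'), ('1901', '19')]
```

The five-digit 20100 is rejected because the trailing `\b` cannot match between two digits.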
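This input text contains both correct and incorrect date values.
Let's combine the day, month and year patterns with a pattern for the delimiters, [-/.]: \b(0[1-9]|[12][0-9]|3[01])[-/.](0[1-9]|1[012])[-/.](19|20)\d{2}\b
But this pattern also matches incorrect values with mixed delimiters, such as 18.03-2027.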
12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 29-01-1971 05-02-1944 02-03-4079 16-02-2021
match | position | $1 | $2 | $3 |
---|---|---|---|---|
12-12-2010 | 0 | 12 | 12 | 20 |
30-11-1950 | 34 | 30 | 11 | 19 |
19/03/1910 | 66 | 19 | 03 | 19 |
16.10.2005 | 77 | 16 | 10 | 20 |
09-03-2074 | 99 | 09 | 03 | 20 |
29-06-2078 | 110 | 29 | 06 | 20 |
28-02-2090 | 132 | 28 | 02 | 20 |
06-05-1939 | 143 | 06 | 05 | 19 |
07.02.1934 | 154 | 07 | 02 | 19 |
18.03-2027 | 177 | 18 | 03 | 20 |
23.12.1904 | 188 | 23 | 12 | 19 |
10/05/2020 | 210 | 10 | 05 | 20 |
30/01.1993 | 221 | 30 | 01 | 19 |
16.01.2011 | 232 | 16 | 01 | 20 |
03-02-2043 | 255 | 03 | 02 | 20 |
05-06/2047 | 277 | 05 | 06 | 20 |
17-11-2000 | 288 | 17 | 11 | 20 |
30-12-2083 | 299 | 30 | 12 | 20 |
22-04-2005 | 310 | 22 | 04 | 20 |
29-01-1971 | 332 | 29 | 01 | 19 |
05-02-1944 | 343 | 05 | 02 | 19 |
16-02-2021 | 365 | 16 | 02 | 20 |
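
The combined pattern and its mixed-delimiter weakness can be sketched in Python as follows. The constant names and the helper `find_dates_any_sep` are illustrative assumptions, and the sample is a handful of values taken from the input text.

```python
import re

DAY = r"(0[1-9]|[12][0-9]|3[01])"
MONTH = r"(0[1-9]|1[012])"
YEAR = r"(19|20)\d{2}"
SEP = r"[-/.]"  # each delimiter is matched independently

DATE_ANY_SEP = re.compile(r"\b" + DAY + SEP + MONTH + SEP + YEAR + r"\b")

def find_dates_any_sep(text):
    """Return every date-like value, even when its two delimiters differ."""
    return [m.group(0) for m in DATE_ANY_SEP.finditer(text)]

sample = "12-12-2010 24-02-20100 14-15-2005 16.10.2005 13.12..1983 18.03-2027"
print(find_dates_any_sep(sample))
# ['12-12-2010', '16.10.2005', '18.03-2027']
```

The last result, 18.03-2027, is accepted because nothing ties the second delimiter to the first.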
To reject mixed delimiters, capture the first delimiter in its own group and repeat it with a backreference (\2, since the delimiter is the second capturing group).
12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 29-01-1971 05-02-1944 02-03-4079 16-02-2021
match | position | $1 | $2 | $3 | $4 |
---|---|---|---|---|---|
12-12-2010 | 0 | 12 | - | 12 | 20 |
30-11-1950 | 34 | 30 | - | 11 | 19 |
19/03/1910 | 66 | 19 | / | 03 | 19 |
16.10.2005 | 77 | 16 | . | 10 | 20 |
09-03-2074 | 99 | 09 | - | 03 | 20 |
29-06-2078 | 110 | 29 | - | 06 | 20 |
28-02-2090 | 132 | 28 | - | 02 | 20 |
06-05-1939 | 143 | 06 | - | 05 | 19 |
07.02.1934 | 154 | 07 | . | 02 | 19 |
23.12.1904 | 188 | 23 | . | 12 | 19 |
10/05/2020 | 210 | 10 | / | 05 | 20 |
16.01.2011 | 232 | 16 | . | 01 | 20 |
03-02-2043 | 255 | 03 | - | 02 | 20 |
17-11-2000 | 288 | 17 | - | 11 | 20 |
30-12-2083 | 299 | 30 | - | 12 | 20 |
22-04-2005 | 310 | 22 | - | 04 | 20 |
29-01-1971 | 332 | 29 | - | 01 | 19 |
05-02-1944 | 343 | 05 | - | 02 | 19 |
16-02-2021 | 365 | 16 | - | 02 | 20 |
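
The table above has a delimiter column in $2, which suggests the first delimiter is captured and then repeated with the backreference \2. A minimal Python sketch under that assumption (the names `DATE_SAME_SEP` and `find_dates_same_sep` are hypothetical):

```python
import re

# Group 2 captures the first delimiter; \2 requires the same one again.
DATE_SAME_SEP = re.compile(
    r"\b(0[1-9]|[12][0-9]|3[01])"  # group 1: day
    r"([-/.])"                     # group 2: delimiter
    r"(0[1-9]|1[012])"             # group 3: month
    r"\2"                          # same delimiter as group 2
    r"(19|20)\d{2}\b"              # group 4: first two digits of the year
)

def find_dates_same_sep(text):
    """Return (match, groups) pairs for dates with consistent delimiters."""
    return [(m.group(0), m.groups()) for m in DATE_SAME_SEP.finditer(text)]

sample = "16.10.2005 18.03-2027 30/01.1993 10/05/2020"
for match, groups in find_dates_same_sep(sample):
    print(match, groups)
# 16.10.2005 ('16', '.', '10', '20')
# 10/05/2020 ('10', '/', '05', '20')
```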
Except for the delimiter group ([-/.]), the other groups can be made non-capturing for a slight performance improvement. Use (?:...) to make a group non-capturing and drop it from the results. You'll then see just $1 instead of $1, $2, $3 and $4. The backreference becomes \1, because the delimiter group is now the only capturing group.
12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 29-01-1971 05-02-1944 02-03-4079 16-02-2021
match | position | $1 |
---|---|---|
12-12-2010 | 0 | - |
30-11-1950 | 34 | - |
19/03/1910 | 66 | / |
16.10.2005 | 77 | . |
09-03-2074 | 99 | - |
29-06-2078 | 110 | - |
28-02-2090 | 132 | - |
06-05-1939 | 143 | - |
07.02.1934 | 154 | . |
23.12.1904 | 188 | . |
10/05/2020 | 210 | / |
16.01.2011 | 232 | . |
03-02-2043 | 255 | - |
17-11-2000 | 288 | - |
30-12-2083 | 299 | - |
22-04-2005 | 310 | - |
29-01-1971 | 332 | - |
05-02-1944 | 343 | - |
16-02-2021 | 365 | - |
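
The non-capturing version might look like this in Python. It is a sketch, with `DATE` and `find_dates` as illustrative names rather than anything defined in the lesson.

```python
import re

# Only the delimiter is captured, so the backreference is now \1.
DATE = re.compile(
    r"\b(?:0[1-9]|[12][0-9]|3[01])"  # day, non-capturing
    r"([-/.])"                       # group 1: delimiter
    r"(?:0[1-9]|1[012])"             # month, non-capturing
    r"\1"                            # same delimiter as group 1
    r"(?:19|20)\d{2}\b"              # year 1900-2099, non-capturing
)

def find_dates(text):
    """Return (match, groups) pairs; groups holds only the delimiter."""
    return [(m.group(0), m.groups()) for m in DATE.finditer(text)]

sample = "16.10.2005 18.03-2027 30/01.1993 10/05/2020"
print(find_dates(sample))
# [('16.10.2005', ('.',)), ('10/05/2020', ('/',))]
```

One Python-specific caveat: with exactly one capturing group, `re.findall` returns only that group's text (here the delimiters), so `finditer` with `m.group(0)` is used to get the full dates.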