Match date values in the following formats: DD-MM-YYYY, DD/MM/YYYY and DD.MM.YYYY.
The year should be from 1900 to 2099.
Let's start with matching the valid day values in the given text of numbers.
The day value should satisfy the following conditions:
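- 01 to 09: 0[1-9]
- 10 to 29: [12][0-9]
- 30 or 31: 3[01]

Using alternation (|), we can accommodate all three conditions in one group: (0[1-9]|[12][0-9]|3[01]). For now, let's wrap the pattern with \b, giving \b(0[1-9]|[12][0-9]|3[01])\b, so that it matches each number as a whole and not part of a longer number.
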
38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 08 11
| match | position | $1 |
|---|---|---|
| 09 | 3 | 09 |
| 19 | 6 | 19 |
| 30 | 17 | 30 |
| 15 | 25 | 15 |
| 24 | 28 | 24 |
| 29 | 31 | 29 |
| 23 | 34 | 23 |
| 20 | 37 | 20 |
| 01 | 43 | 01 |
| 22 | 46 | 22 |
| 18 | 65 | 18 |
| 28 | 71 | 28 |
| 31 | 74 | 31 |
| 21 | 98 | 21 |
| 12 | 104 | 12 |
| 04 | 107 | 04 |
| 03 | 110 | 03 |
| 10 | 113 | 10 |
| 26 | 118 | 26 |
| 14 | 127 | 14 |
| 27 | 133 | 27 |
| 13 | 141 | 13 |
| 06 | 150 | 06 |
| 16 | 158 | 16 |
| 17 | 169 | 17 |
| 25 | 182 | 25 |
| 08 | 196 | 08 |
| 11 | 199 | 11 |
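
The day step can be reproduced with a minimal Python sketch. It assumes the pattern is \b(0[1-9]|[12][0-9]|3[01])\b, as described above; the names TEXT, DAY, day_matches and unanchored are illustrative only. It also shows what the word boundaries prevent.

```python
import re

# The sample text of numbers used above.
TEXT = ("38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 "
        "18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 "
        "39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 "
        "08 11")

# Assumed pattern: the three day alternatives in one group, wrapped in \b.
DAY = re.compile(r"\b(0[1-9]|[12][0-9]|3[01])\b")

# (match, position, $1) for every valid day value.
day_matches = [(m.group(0), m.start(), m.group(1)) for m in DAY.finditer(TEXT)]
for row in day_matches:
    print(row)

# Without \b the same alternatives also match inside longer numbers.
unanchored = re.findall(r"0[1-9]|[12][0-9]|3[01]", "2019 0210")
print(unanchored)  # ['20', '19', '02', '10']
```

With the boundaries in place, 2019 and 0210 produce no day matches at all, which is the behaviour the table shows.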
The month value should satisfy the following conditions:
- 01 to 09: 0[1-9]
- 10 to 12: 1[012]

Combined with alternation and word boundaries, the pattern is \b(0[1-9]|1[012])\b.

38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 08 11
| match | position | $1 |
|---|---|---|
| 09 | 3 | 09 |
| 01 | 43 | 01 |
| 12 | 104 | 12 |
| 04 | 107 | 04 |
| 03 | 110 | 03 |
| 10 | 113 | 10 |
| 06 | 150 | 06 |
| 08 | 196 | 08 |
| 11 | 199 | 11 |
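
As a quick check of the month step, the sketch below assumes the pattern \b(0[1-9]|1[012])\b and confirms that writing the second alternative as a range, 1[0-2], gives the same result. The names TEXT, months and months_range are illustrative.

```python
import re

# The sample text of numbers used above.
TEXT = ("38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 "
        "18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 "
        "39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 "
        "08 11")

# Character set written out explicitly, as in the conditions above.
months = re.findall(r"\b(0[1-9]|1[012])\b", TEXT)

# The same set written as a range; both forms accept 10, 11 and 12.
months_range = re.findall(r"\b(0[1-9]|1[0-2])\b", TEXT)

print(months)
print(months == months_range)  # True
```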
The year value should satisfy the following conditions:
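- Starts with 19 or 20: (19|20)
- Followed by any two digits: \d{2}

Wrapped in word boundaries, the pattern is \b(19|20)\d{2}\b. Note that $1 holds only the first two digits, because the group covers just (19|20).

38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 08 11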
| match | position | $1 |
|---|---|---|
| 2019 | 9 | 20 |
| 2000 | 20 | 20 |
| 1975 | 60 | 19 |
| 1925 | 136 | 19 |
| 2025 | 153 | 20 |
| 2005 | 164 | 20 |
| 1980 | 177 | 19 |
| 1901 | 191 | 19 |
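
The difference between the full match and $1 in the table above is easy to trip over in code. The following sketch assumes the pattern \b(19|20)\d{2}\b; TEXT, YEAR, years and prefixes are illustrative names. In Python, re.findall returns the group contents when the pattern has a capturing group, so the full year has to be read from group(0).

```python
import re

# The sample text of numbers used above.
TEXT = ("38 09 19 2019 35 30 2000 15 24 29 23 20 47 01 22 7 50 32 36 1975 "
        "18 45 28 31 40 34 190 37 48 0210 21 43 12 04 03 10 2 26 20100 14 "
        "39 27 1925 13 42 41 06 2025 16 46 2005 17 33 5 1980 25 49 44 1901 "
        "08 11")

# Assumed pattern: 19 or 20, then two digits, as a whole number.
YEAR = re.compile(r"\b(19|20)\d{2}\b")

# group(0) is the whole year; group(1) is only the captured prefix ($1).
years = [(m.group(0), m.start(), m.group(1)) for m in YEAR.finditer(TEXT)]
for row in years:
    print(row)

# findall returns just the group, not the four-digit year.
prefixes = YEAR.findall(TEXT)
print(prefixes)
```

Values such as 190, 0210 and 20100 are skipped because of the word boundaries and the fixed two-digit tail.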
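Let's combine the patterns above with a pattern for the delimiters, [-/.]: \b(0[1-9]|[12][0-9]|3[01])[-/.](0[1-9]|1[012])[-/.](19|20)\d{2}\b.
The input text below contains both correct and incorrect date values.
Note that this pattern also matches values with mixed delimiters, such as 18.03-2027, which should be rejected.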
12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 29-01-1971 05-02-1944 02-03-4079 16-02-2021
| match | position | $1 | $2 | $3 |
|---|---|---|---|---|
| 12-12-2010 | 0 | 12 | 12 | 20 |
| 30-11-1950 | 34 | 30 | 11 | 19 |
| 19/03/1910 | 66 | 19 | 03 | 19 |
| 16.10.2005 | 77 | 16 | 10 | 20 |
| 09-03-2074 | 99 | 09 | 03 | 20 |
| 29-06-2078 | 110 | 29 | 06 | 20 |
| 28-02-2090 | 132 | 28 | 02 | 20 |
| 06-05-1939 | 143 | 06 | 05 | 19 |
| 07.02.1934 | 154 | 07 | 02 | 19 |
| 18.03-2027 | 177 | 18 | 03 | 20 |
| 23.12.1904 | 188 | 23 | 12 | 19 |
| 10/05/2020 | 210 | 10 | 05 | 20 |
| 30/01.1993 | 221 | 30 | 01 | 19 |
| 16.01.2011 | 232 | 16 | 01 | 20 |
| 03-02-2043 | 255 | 03 | 02 | 20 |
| 05-06/2047 | 277 | 05 | 06 | 20 |
| 17-11-2000 | 288 | 17 | 11 | 20 |
| 30-12-2083 | 299 | 30 | 12 | 20 |
| 22-04-2005 | 310 | 22 | 04 | 20 |
| 29-01-1971 | 332 | 29 | 01 | 19 |
| 05-02-1944 | 343 | 05 | 02 | 19 |
| 16-02-2021 | 365 | 16 | 02 | 20 |
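
To see which of the matches above use two different delimiters, here is a small sketch. It assumes the combined pattern is the day, month and year groups joined by [-/.]; the names DATES, DAY, MONTH, YEAR, SEP, DATE, matches and mixed are illustrative.

```python
import re

# The sample date text used above.
DATES = ("12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 "
         "19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 "
         "28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 "
         "29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 "
         "13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 "
         "29-01-1971 05-02-1944 02-03-4079 16-02-2021")

# Building blocks from the earlier steps.
DAY = r"(0[1-9]|[12][0-9]|3[01])"
MONTH = r"(0[1-9]|1[012])"
YEAR = r"(19|20)\d{2}"
SEP = r"[-/.]"

# Each delimiter position is matched independently of the other.
DATE = re.compile(rf"\b{DAY}{SEP}{MONTH}{SEP}{YEAR}\b")

matches = [m.group(0) for m in DATE.finditer(DATES)]

# Every match is exactly 10 characters (DD?MM?YYYY), so the two
# delimiters sit at indexes 2 and 5.
mixed = [s for s in matches if s[2] != s[5]]

print(len(matches))  # 22
print(mixed)         # ['18.03-2027', '30/01.1993', '05-06/2047']
```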
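To reject mixed delimiters, capture the first delimiter in its own group, ([-/.]), and require the same character again with the backreference \2: \b(0[1-9]|[12][0-9]|3[01])([-/.])(0[1-9]|1[012])\2(19|20)\d{2}\b.

12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 29-01-1971 05-02-1944 02-03-4079 16-02-2021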
| match | position | $1 | $2 | $3 | $4 |
|---|---|---|---|---|---|
| 12-12-2010 | 0 | 12 | - | 12 | 20 |
| 30-11-1950 | 34 | 30 | - | 11 | 19 |
| 19/03/1910 | 66 | 19 | / | 03 | 19 |
| 16.10.2005 | 77 | 16 | . | 10 | 20 |
| 09-03-2074 | 99 | 09 | - | 03 | 20 |
| 29-06-2078 | 110 | 29 | - | 06 | 20 |
| 28-02-2090 | 132 | 28 | - | 02 | 20 |
| 06-05-1939 | 143 | 06 | - | 05 | 19 |
| 07.02.1934 | 154 | 07 | . | 02 | 19 |
| 23.12.1904 | 188 | 23 | . | 12 | 19 |
| 10/05/2020 | 210 | 10 | / | 05 | 20 |
| 16.01.2011 | 232 | 16 | . | 01 | 20 |
| 03-02-2043 | 255 | 03 | - | 02 | 20 |
| 17-11-2000 | 288 | 17 | - | 11 | 20 |
| 30-12-2083 | 299 | 30 | - | 12 | 20 |
| 22-04-2005 | 310 | 22 | - | 04 | 20 |
| 29-01-1971 | 332 | 29 | - | 01 | 19 |
| 05-02-1944 | 343 | 05 | - | 02 | 19 |
| 16-02-2021 | 365 | 16 | - | 02 | 20 |
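
The table above can be reproduced with the sketch below. It assumes the delimiter is captured as the second group and repeated with \2, which is consistent with the $2 column; DATES, DATE, rows and full_matches are illustrative names.

```python
import re

# The sample date text used above.
DATES = ("12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 "
         "19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 "
         "28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 "
         "29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 "
         "13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 "
         "29-01-1971 05-02-1944 02-03-4079 16-02-2021")

# Group 2 captures the first delimiter; \2 demands the same character
# between month and year.
DATE = re.compile(
    r"\b(0[1-9]|[12][0-9]|3[01])([-/.])(0[1-9]|1[012])\2(19|20)\d{2}\b"
)

# (match, position, $1, $2, $3, $4), one row per match.
rows = [(m.group(0), m.start(), *m.groups()) for m in DATE.finditer(DATES)]
for row in rows:
    print(row)

full_matches = [row[0] for row in rows]
print("18.03-2027" in full_matches)  # False
```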
Except for the delimiter group, ([-/.]), the other groups can be made non-capturing for a slight performance improvement. Use (?:...) to make a group non-capturing, which also drops it from the results: you'll see just $1 instead of $1, $2, $3 and $4. Because the delimiter group is now the only capturing group, the backreference changes from \2 to \1.
12-12-2010 24-02-20100 14-15-2005 30-11-1950 16-02-201 32-12-2010 19/03/1910 16.10.2005 14.13.1985 09-03-2074 29-06-2078 19-09-1855 28-02-2090 06-05-1939 07.02.1934 13.12..1983 18.03-2027 23.12.1904 29-04-0901 10/05/2020 30/01.1993 16.01.2011 04-010-2063 03-02-2043 13-12-3015 05-06/2047 17-11-2000 30-12-2083 22-04-2005 34-03-1986 29-01-1971 05-02-1944 02-03-4079 16-02-2021
| match | position | $1 |
|---|---|---|
| 12-12-2010 | 0 | - |
| 30-11-1950 | 34 | - |
| 19/03/1910 | 66 | / |
| 16.10.2005 | 77 | . |
| 09-03-2074 | 99 | - |
| 29-06-2078 | 110 | - |
| 28-02-2090 | 132 | - |
| 06-05-1939 | 143 | - |
| 07.02.1934 | 154 | . |
| 23.12.1904 | 188 | . |
| 10/05/2020 | 210 | / |
| 16.01.2011 | 232 | . |
| 03-02-2043 | 255 | - |
| 17-11-2000 | 288 | - |
| 30-12-2083 | 299 | - |
| 22-04-2005 | 310 | - |
| 29-01-1971 | 332 | - |
| 05-02-1944 | 343 | - |
| 16-02-2021 | 365 | - |
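
As a rough sketch of the non-capturing variant, the code below compares it with the capturing one on a short sample. Both patterns are assumptions derived from the description above, and the names CAPTURING, NON_CAPTURING, SAMPLE, same_matches and delimiters are illustrative. It checks only that the matches are identical and does not measure performance.

```python
import re

# Four capturing groups; the delimiter is group 2.
CAPTURING = re.compile(
    r"\b(0[1-9]|[12][0-9]|3[01])([-/.])(0[1-9]|1[012])\2(19|20)\d{2}\b"
)

# Only the delimiter is captured, so the backreference becomes \1.
NON_CAPTURING = re.compile(
    r"\b(?:0[1-9]|[12][0-9]|3[01])([-/.])(?:0[1-9]|1[012])\1(?:19|20)\d{2}\b"
)

# A short excerpt of the sample date text used above.
SAMPLE = "12-12-2010 19/03/1910 18.03-2027 16.10.2005 13.12..1983"

# Pattern.groups is the number of capturing groups in each pattern.
print(CAPTURING.groups, NON_CAPTURING.groups)  # 4 1

# Both variants accept exactly the same dates.
same_matches = (
    [m.group(0) for m in CAPTURING.finditer(SAMPLE)]
    == [m.group(0) for m in NON_CAPTURING.finditer(SAMPLE)]
)
print(same_matches)  # True

# With a single group, findall returns the delimiter of each match.
delimiters = NON_CAPTURING.findall(SAMPLE)
print(delimiters)  # ['-', '/', '.']
```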