datefinder - extract dates from text
A Python module for locating dates inside text. Use this package to extract all sorts of date-like strings from a document and turn them into datetime objects.
This module finds the likely datetime strings and then uses the dateparser package to convert them to datetime objects.
Installation
pip install datefinder
How to Use
datefinder.find_dates(text, source=False, index=False, strict=False, base_date=None)
Extract datetime strings from text.
Parameters:
- text (str|unicode) – A string that contains one or more natural language or literal datetime strings
- source (boolean) – Return the original string segment
- index (boolean) – Return the indices where the datetime string was located in text
- strict (boolean) – Only return datetimes with complete date information. For example, July 2016 or Monday will not return datetimes, while May 16, 2015 will.
- base_date (datetime) – Set a default base datetime when parsing incomplete dates
Returns: A generator that produces datetime.datetime objects, or tuples that also contain the source text and index, if requested
In [1]: string_with_dates = """
...: ...
...: entries are due by January 4th, 2017 at 8:00pm
...: ...
...: created 01/15/2005 by ACME Inc. and associates.
...: ...
...: """
In [2]: import datefinder
In [3]: matches = datefinder.find_dates(string_with_dates)
In [4]: for match in matches:
...:     print(match)
...:
2017-01-04 20:00:00
2005-01-15 00:00:00