Written on 2026-04-06
Tags: bash OpenData PostgreSQL
The US Census Bureau site provides a downloadable file containing the dates of Easter from 1600 to 2099. We are interested in the ASCII text file, since it is easy to parse with standard bash utilities. The file consists of dates in a space-separated month day year format, which we will need to convert to the ISO 8601 format.
We should also keep in mind that the dataset includes only Easter Sunday. From this date, we can derive Easter Monday and so on.
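For example, Easter Monday is simply one day after Easter Sunday, and GNU date can do that arithmetic for us (shown here with the 2026 dates as an illustration):

```shell
# Easter Sunday 2026 falls on April 5; add one day to get Easter Monday
date --date "2026-04-05 + 1 day" +%F
# → 2026-04-06
```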
Although a dataset spanning 500 years of Easter dates may be useful, we will limit it to the range from the start of the Unix epoch (1970) up to next year. If we need to expand the range later, we can regenerate the dataset.
This page describes downloading and modifying the dataset using bash.

Fetching the data is straightforward, as everything is contained in a single file. Simply run curl to download it.
#! /usr/bin/env bash
declare -r url="https://www2.census.gov/software/x-13arima-seats/win-genhol/download/easter500.txt"
declare -r download_file="$(basename "${url}")"
# Download data
curl --silent \
     --show-error \
     --output "${download_file}" \
     "${url}"
A quick look at the downloaded file shows that the dataset contains dates from 1600 to 2099, with a total of 500 entries.
ubuntu@ubuntu:~/easter-dates$ head --lines 3 easter500.txt
4 2 1600
4 22 1601
4 7 1602
ubuntu@ubuntu:~/easter-dates$ tail --lines 3 easter500.txt
3 31 2097
4 20 2098
4 12 2099
ubuntu@ubuntu:~/easter-dates$ wc --lines easter500.txt
500 easter500.txt

Let’s look at the downloaded file:
ubuntu@ubuntu:~/easter-dates$ file easter500.txt
easter500.txt: ASCII text, with CRLF line terminators
The file is an ASCII text file with CRLF line terminators. For further processing in bash, we need to convert CRLF to LF. To do so, we will use the dos2unix utility.
ubuntu@ubuntu:~/easter-dates$ dos2unix easter500.txt
dos2unix: converting file easter500.txt to Unix format...
ubuntu@ubuntu:~/easter-dates$ file easter500.txt
easter500.txt: ASCII text
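If dos2unix happens not to be installed, stripping the carriage returns with tr works just as well. A minimal alternative sketch, writing to a temporary file rather than converting in place:

```shell
# Delete every CR byte; the result uses plain LF line endings
tr --delete '\r' < easter500.txt > easter500.unix.txt
mv easter500.unix.txt easter500.txt
```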
The next step is to convert the dates to the ISO 8601 format (%Y-%m-%d) and filter out the years we do not need. The awk utility is well suited for this task. Below is a small awk script:
get-easter-dates.awk
#! /usr/bin/awk --file
BEGIN {
    # Start from the beginning of the Unix epoch, the year 1970
    begin_year = 1970;
    # End with the next year from now
    end_year = strftime("%Y") + 1;
}
{
    # The 1st column holds the month
    month = $1;
    # The 2nd column holds the day
    day = $2;
    # The 3rd column holds the year
    year = $3;
    # Format the Easter Sunday date in ISO 8601
    sunday = strftime("%Y-%m-%d", mktime(year " " month " " day " 0 0 0"));
    # Format the Easter Monday date in ISO 8601; mktime normalizes
    # out-of-range day values, so a month rollover is handled correctly
    monday = strftime("%Y-%m-%d", mktime(year " " month " " (day + 1) " 0 0 0"));
    # Filter the dates and print the results
    if (year >= begin_year && year <= end_year) {
        print sunday;
        print monday;
    }
}
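Before running the script over the full dataset, we can smoke-test the conversion logic on a couple of hand-picked rows. The inline one-liner below mirrors the script body in simplified form (it assumes GNU awk, which provides strftime and mktime); the second row checks that mktime normalizes a month rollover, turning March 31 + 1 day into April 1:

```shell
# Easter 2020 fell on April 12; Easter 2024 fell on March 31,
# so its Monday rolls over into April.
printf '4 12 2020\n3 31 2024\n' | awk '{
    sunday = strftime("%Y-%m-%d", mktime($3 " " $1 " " $2 " 0 0 0"));
    monday = strftime("%Y-%m-%d", mktime($3 " " $1 " " ($2 + 1) " 0 0 0"));
    print sunday;
    print monday;
}'
```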
Putting it all together:
#! /usr/bin/env bash
# Transform and filter dates
awk --file get-easter-dates.awk \
    easter500.txt > "easter-dates.$(date +"%Y").txt"
The final step in modifying the data is to review the contents of the output file.
ubuntu@ubuntu:~/easter-dates$ head --lines 3 easter-dates.$(date +"%Y").txt
1970-03-29
1970-03-30
1971-04-11
ubuntu@ubuntu:~/easter-dates$ tail --lines 3 easter-dates.$(date +"%Y").txt
2026-04-06
2027-03-28
2027-03-29
ubuntu@ubuntu:~/easter-dates$ wc --lines easter-dates.$(date +"%Y").txt
116 easter-dates.2026.txt
ubuntu@ubuntu:~/easter-dates$ echo "(2027 + 1 - 1970) * 2" | bc --mathlib
116
Everything looks good. On to the next step.

The final storage location for the data will be a PostgreSQL database. There is an opendata database where we will store the table.
CREATE TABLE IF NOT EXISTS easter_dates (
    holiday_date DATE NOT NULL
);
CREATE UNIQUE INDEX IF NOT EXISTS easter_dates_holiday_date_idx
    ON easter_dates (holiday_date);
For more detailed information about the CREATE TABLE and CREATE INDEX syntax, please consult the PostgreSQL documentation. For this task, I am using version 18.3.
Additionally, there is a useful blog post about naming conventions for the date and timestamp columns: Naming Conventions: Date and Timestamp Columns. Well worth spending two minutes reading it.
After creating the table, we use the psql \COPY meta-command to load the dates directly into the database.
#! /usr/bin/env bash
declare -r final_file="easter-dates.$(date +"%Y").txt"
echo "\\COPY easter_dates (holiday_date) FROM '${final_file}';" \
    | psql --host postgres.pi --username opendata opendata
INFO: The connection string for psql may differ for your setup. I extensively use the .pgpass file for database connections, which is why no password is specified in the parameters.
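For illustration, a .pgpass entry matching the connection above might look like the following (the port, host, and password shown are placeholders for your own values). Note that libpq ignores the file unless it is readable only by its owner:

```shell
# Append an entry in the hostname:port:database:username:password format
echo 'postgres.pi:5432:opendata:opendata:s3cret' >> ~/.pgpass
# Restrict permissions, otherwise libpq skips the file
chmod 600 ~/.pgpass
```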
Finally, let’s explore the data and, at the same time, validate the values.
opendata=> SELECT
COUNT(*)
FROM
easter_dates
WHERE
NOT (extract(isodow from holiday_date) IN (1,7));
count
-------
0
(1 row)
opendata=> SELECT
COUNT(*) AS total_num,
MIN(holiday_date) AS first_date,
MAX(holiday_date) AS last_date,
MIN(to_char(holiday_date, 'MM-DD')) AS earliest_holiday_in_year,
MAX(to_char(holiday_date, 'MM-DD')) AS latest_holiday_in_year
FROM
easter_dates;
total_num | first_date | last_date | earliest_holiday_in_year | latest_holiday_in_year
-----------+------------+------------+--------------------------+------------------------
116 | 1970-03-29 | 2027-03-29 | 03-23 | 04-25
(1 row)
This workflow demonstrates how to efficiently retrieve, transform, and load a dataset using standard Unix tools and PostgreSQL. Starting from a raw ASCII file, we converted the data into a structured ISO 8601 format, filtered and validated it, and finally stored it in a relational database for further use. This approach highlights the simplicity and power of combining shell utilities with SQL for practical data processing tasks.