Ah, timesheets. The Lex Luthor of every software engineer’s existence. If you’re like me, you’d rather debug a race condition at 3 a.m. than spend even five minutes documenting what you did all day. But alas, as freelancers or full-time employees, we’re often stuck with this tedious chore.
This year, I hit my breaking point. After a whirlwind of projects—some canceled, some pivoted, and others shelved indefinitely—I found myself staring down the barrel of a year-end timesheet deadline. The thought of manually reconstructing my entire year’s work made me want to scream. So, I did what any self-respecting engineer would do: I decided to automate the problem away.
This is the story of how I turned a day of dread into a coding adventure. Buckle up—it’s going to be a lazy, efficient ride.
The Problem: Timesheets Are the Worst
Let’s set the scene:
The Task: Document every hour spent on every task for the entire year.
The Obstacle: My memory is about as reliable as a flaky integration test.
The Deadline: One day. Yes, one day.
I knew I couldn’t do this manually. So, I hatched a plan to pull data from all the tools I use daily—JIRA, Git, Slack, and Outlook—and stitch it together into a coherent timesheet.
The Tools of the Trade
Here’s what I used:
JIRA: For tracking tasks and tickets.
Git: For commit history (because every good engineer ties commits to tickets, right?).
Slack: For team communication (because meetings and messages count too).
Outlook: For calendar events (because apparently, meetings are work too).
Step 1: Pulling JIRA Tickets
My first stop was JIRA. I needed to pull all the tickets assigned to me within a specific date range. Thankfully, JIRA has a robust API, and with a little Python magic, I was able to automate this process.
The Script
Here’s the Python script I wrote to fetch JIRA tickets:
import os
from jira import JIRA
import pandas as pd
from datetime import datetime
import logging
import sys
from typing import List, Dict, Any
import argparse
# Configure logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
class JiraTicketPuller:
def __init__(self, server: str, email: str, api_token: str):
"""
Initialize JIRA client with authentication credentials.
Args:
server: JIRA server URL
email: User's email for authentication
api_token: JIRA API token
"""
try:
self.jira = JIRA(server=server, basic_auth=(email, api_token))
logger.info("Successfully connected to JIRA")
except Exception as e:
logger.error(f"Failed to connect to JIRA: {str(e)}")
sys.exit(1)
def get_user_tickets(
self,
username: str,
start_date: str,
end_date: str
) -> List[Dict[str, Any]]:
"""
Retrieve all tickets assigned to a user within a date range.
Args:
username: JIRA username to search for
start_date: Start date in YYYY-MM-DD format
end_date: End date in YYYY-MM-DD format
Returns:
List of dictionaries containing ticket information
"""
try:
# Construct JQL query
jql_query = (
f'assignee = {username} AND '
f'created >= {start_date} AND '
f'created <= {end_date} '
'ORDER BY created DESC'
)
# Fields to retrieve
fields = [
'summary',
'status',
'priority',
'issuetype',
'created',
'updated',
'project',
'components',
'description'
]
# Get issues using JQL
issues = self.jira.search_issues(
jql_query,
maxResults=1000, # Adjust based on your needs
fields=fields
)
tickets_data = []
for issue in issues:
ticket = {
'Key': issue.key,
'Summary': issue.fields.summary,
'Status': str(issue.fields.status),
'Priority': str(issue.fields.priority),
'Issue Type': str(issue.fields.issuetype),
'Created': issue.fields.created[:10], # Get date only
'Updated': issue.fields.updated[:10], # Get date only
'Project': str(issue.fields.project),
'Components': ', '.join([str(c) for c in issue.fields.components]),
'Description': issue.fields.description or ''
}
tickets_data.append(ticket)
logger.info(f"Retrieved {len(tickets_data)} tickets for user {username}")
return tickets_data
except Exception as e:
logger.error(f"Error retrieving tickets: {str(e)}")
return []
def export_to_csv(self, tickets: List[Dict[str, Any]], output_file: str):
"""
Export tickets data to CSV file.
Args:
tickets: List of ticket dictionaries
output_file: Path to output CSV file
"""
try:
if not tickets:
logger.warning("No tickets to export")
return
df = pd.DataFrame(tickets)
df.to_csv(output_file, index=False, encoding='utf-8')
logger.info(f"Successfully exported tickets to {output_file}")
except Exception as e:
logger.error(f"Error exporting to CSV: {str(e)}")
def validate_date(date_str: str) -> bool:
"""Validate date string format (YYYY-MM-DD)."""
try:
datetime.strptime(date_str, '%Y-%m-%d')
return True
except ValueError:
return False
def main():
# Set up argument parser
parser = argparse.ArgumentParser(
description='Pull JIRA tickets for a user within a date range'
)
parser.add_argument('--username', required=True, help='JIRA username')
parser.add_argument('--start-date', required=True, help='Start date (YYYY-MM-DD)')
parser.add_argument('--end-date', required=True, help='End date (YYYY-MM-DD)')
parser.add_argument(
'--output',
default='jira_tickets.csv',
help='Output CSV file path'
)
args = parser.parse_args()
# Validate dates
if not all(validate_date(date) for date in [args.start_date, args.end_date]):
logger.error("Invalid date format. Please use YYYY-MM-DD")
sys.exit(1)
# Configuration
JIRA_SERVER = os.getenv("JIRA_SERVER")
JIRA_EMAIL = os.getenv("JIRA_EMAIL")
JIRA_API_TOKEN = os.getenv("JIRA_API_TOKEN")
# Initialize ticket puller
puller = JiraTicketPuller(JIRA_SERVER, JIRA_EMAIL, JIRA_API_TOKEN)
# Get and export tickets
tickets = puller.get_user_tickets(
args.username,
args.start_date,
args.end_date
)
puller.export_to_csv(tickets, args.output)
if __name__ == "__main__":
# Ensure JIRA API token is set
if not os.getenv("JIRA_API_TOKEN"):
logging.error("JIRA_API_TOKEN environment variable not set")
return
main()
How It Works
Authentication: The script uses your JIRA email and API token to authenticate.
JQL Query: It constructs a JQL query to fetch tickets assigned to you within a date range.
Data Export: The results are exported to a CSV file for easy analysis.
Step 2: Fetching Git Commits
Next, I turned to Git. Since our team follows a practice of including JIRA ticket IDs in commit messages, I wrote a script to extract commit data and map it to tickets.
The Script
import subprocess
import csv
import re
def get_git_commits(since_date=None, author=None):
# Get commit data with full commit message
cmd = ['git', 'log', '--pretty=format:%h|%ad|%s', '--date=iso']
if since_date:
cmd.extend(['--since', since_date])
if author:
cmd.extend(['--author', author])
result = subprocess.run(cmd, capture_output=True, text=True)
commits = result.stdout.strip().split('\n')
# Process commits to extract JIRA IDs
processed_commits = []
jira_pattern = re.compile(r'JIRA:\s*([A-Z]+-\d+)')
for commit in commits:
if commit: # Skip empty lines
hash_id, date, message = commit.split('|', 2)
jira_match = jira_pattern.search(message)
jira_id = jira_match.group(1) if jira_match else ''
processed_commits.append({
'date': date,
'commit': hash_id,
'jira_id': jira_id
})
# Write to CSV
csv_filename = 'git_commits_jira.csv'
fieldnames = ['date', 'commit', 'jira_id']
with open(csv_filename, 'w', newline='') as csvfile:
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(processed_commits)
return processed_commits
How It Works
Git Log: The script uses
git log
to fetch commit history.JIRA ID Extraction: It uses regex to extract JIRA ticket IDs from commit messages.
CSV Export: The results are saved to a CSV file for later use.
Step 3: Tackling Slack Messages
Slack was trickier. Messages are context-heavy, and mapping them to specific tasks isn’t straightforward. I briefly considered using AI to parse the data but decided against it due to cost and complexity. Instead, I created a generic ticket to capture communication time and wrote a script to fetch Slack messages.
The Script
import os
from datetime import datetime
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
import pandas as pd
def get_all_conversations(client):
"""
Get all conversations (channels, private channels, and DMs) the bot has access to.
Args:
client: Slack WebClient instance
Returns:
list: List of conversation objects
"""
conversations = []
try:
# Get all conversations (channels, private channels, and DMs)
result = client.conversations_list(
types="public_channel,private_channel,im,mpim",
limit=1000
)
conversations.extend(result["channels"])
# Handle pagination
while result.get("response_metadata", {}).get("next_cursor"):
cursor = result["response_metadata"]["next_cursor"]
result = client.conversations_list(
types="public_channel,private_channel,im,mpim",
cursor=cursor,
limit=1000
)
conversations.extend(result["channels"])
except SlackApiError as e:
print(f"Error fetching conversations: {e.response['error']}")
return conversations
def get_user_info(client, user_id):
"""
Get user information from their ID.
Args:
client: Slack WebClient instance
user_id: User's Slack ID
Returns:
dict: User information including real name and email
"""
try:
result = client.users_info(user=user_id)
user = result["user"]
return {
"real_name": user.get("real_name", "Unknown"),
"email": user.get("profile", {}).get("email", "Unknown"),
"display_name": user.get("profile", {}).get("display_name", "Unknown")
}
except SlackApiError:
return {"real_name": "Unknown", "email": "Unknown", "display_name": "Unknown"}
def get_messages_from_conversation(client, channel_id, channel_name, start_date, end_date):
"""
Retrieve messages from a specific conversation within a date range.
Args:
client: Slack WebClient instance
channel_id (str): The ID of the Slack channel
channel_name (str): The name of the Slack channel
start_date (str): Start date in format 'YYYY-MM-DD'
end_date (str): End date in format 'YYYY-MM-DD'
Returns:
list: List of messages within the date range
"""
messages = []
start_timestamp = int(datetime.strptime(start_date, '%Y-%m-%d').timestamp())
end_timestamp = int(datetime.strptime(end_date, '%Y-%m-%d').timestamp())
try:
# Initialize the cursor for pagination
cursor = None
while True:
# Get messages using conversations.history
result = client.conversations_history(
channel=channel_id,
oldest=start_timestamp,
latest=end_timestamp,
limit=100,
cursor=cursor
)
# Process messages
for msg in result['messages']:
user_id = msg.get('user')
user_info = get_user_info(client, user_id) if user_id else {
"real_name": "Unknown",
"email": "Unknown",
"display_name": "Unknown"
}
message_data = {
'channel_name': channel_name,
'channel_id': channel_id,
'text': msg.get('text', ''),
'user_id': user_id,
'user_name': user_info['real_name'],
'user_email': user_info['email'],
'timestamp': datetime.fromtimestamp(float(msg['ts'])).strftime('%Y-%m-%d %H:%M:%S'),
'thread_ts': msg.get('thread_ts', None)
}
messages.append(message_data)
# Check if there are more messages to fetch
cursor = result.get('response_metadata', {}).get('next_cursor')
if not cursor:
break
except SlackApiError as e:
print(f"Error fetching messages from {channel_name}: {e.response['error']}")
return messages
def main():
# Set your date range
start_date = "2024-01-01"
end_date = "2024-01-31"
# Initialize the Slack client
slack_token = os.environ.get('SLACK_BOT_TOKEN')
if not slack_token:
raise ValueError("Please set the SLACK_BOT_TOKEN environment variable")
client = WebClient(token=slack_token)
# Get all conversations
print("Fetching all conversations...")
conversations = get_all_conversations(client)
# Fetch messages from all conversations
all_messages = []
total_conversations = len(conversations)
print(f"Found {total_conversations} conversations. Fetching messages...")
for idx, conv in enumerate(conversations, 1):
channel_id = conv["id"]
channel_name = conv.get("name", "DM") # DMs don't have names
print(f"Processing {idx}/{total_conversations}: {channel_name}")
messages = get_messages_from_conversation(
client,
channel_id,
channel_name,
start_date,
end_date
)
all_messages.extend(messages)
# Convert to DataFrame for easier handling
df = pd.DataFrame(all_messages)
# Save to CSV
output_file = f"slack_messages_{start_date}_to_{end_date}.csv"
df.to_csv(output_file, index=False, encoding='utf-8')
print(f"\nTotal messages retrieved: {len(all_messages)}")
print(f"Messages saved to: {output_file}")
# Print summary
print("\nMessages per channel:")
channel_summary = df.groupby('channel_name').size().sort_values(ascending=False)
print(channel_summary)
if __name__ == "__main__":
main()
How It Works
Conversation List: The script fetches all channels and DMs accessible to the bot.
Message Retrieval: It retrieves messages within a specified date range.
CSV Export: The messages are saved to a CSV file for further analysis.
Step 4: Capturing Outlook Meetings
Finally, I needed to account for meetings. Using the exchangelib
Python library, I wrote a script to pull calendar events and export them to a CSV.
The Script
from exchangelib import Credentials, Account
import pandas as pd
from datetime import datetime, timedelta
def get_outlook_meetings():
# Replace with your email and password
email = os.environ.get("JIRA_EMAIL")
password = os.environ.get('OUTLOOK_PASSWORD')
# Connect to Exchange
credentials = Credentials(email, password)
account = Account(email, credentials=credentials, autodiscover=True)
# Get calendar items for the last 30 days (adjust as needed)
start_date = datetime.now() - timedelta(days=30)
end_date = datetime.now() + timedelta(days=30)
# Get calendar items
calendar_items = account.calendar.view(
start=start_date,
end=end_date
)
# Prepare data structure
meetings = []
# Process each appointment
for item in calendar_items:
# Calculate duration in minutes
duration = int((item.end - item.start).total_seconds() / 60)
meetings.append({
'date': item.start.strftime('%Y-%m-%d'),
'time': item.start.strftime('%H:%M'),
'duration': duration,
'title': item.subject
})
# Create DataFrame and save to CSV
df = pd.DataFrame(meetings)
df.to_csv('outlook_meetings.csv', index=False)
return df
# Run the function
if __name__ == "__main__":
meetings_df = get_outlook_meetings()
print(f"Exported {len(meetings_df)} meetings to outlook_meetings.csv")
How It Works
Authentication: The script uses your Outlook email and password to authenticate.
Calendar Query: It fetches calendar events within a specified date range.
CSV Export: The events are saved to a CSV file.
What’s Next?
At this point, I had four CSV files:
JIRA Tickets: All the tasks I worked on.
Git Commits: All the code I wrote.
Slack Messages: All the communication I participated in.
Outlook Meetings: All the meetings I attended.
In Part 2, I’ll show you how I stitched these datasets together to create a comprehensive timesheet. Spoiler alert: It involves more Python, some data wrangling, and a little bit of magic.
Stay tuned, and remember: Laziness is the mother of invention.
What’s your least favorite chore as a software engineer? Have you automated it yet? Share your stories in the comments!
Top comments (0)