Andrea Daly

Log Tracking Component Overview


Project Purpose

The codebase traces missing records through a chain of AWS Lambda functions. It uses CloudWatch Logs Insights to query each Lambda's log group and determine whether a missing record passed through that Lambda and, if so, which processing point it reached. The program then compares a file listing the missing records against the records returned by the log group queries. Matched records are proven to have reached the specified point in that Lambda's processing; unmatched records indicate where the flow failed.
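The comparison described above comes down to two set operations. The following is a minimal sketch, not the project's code: the class name `RecordMatcher` and both method names are hypothetical, and it assumes the missing-record IDs and the IDs found in the logs have already been loaded into sets.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: splits missing-record IDs into those seen in a Lambda's logs and those not seen. */
final class RecordMatcher {

    /** IDs from the missing-records file that also appear in the log query results. */
    static Set<String> matched(Set<String> missingIds, Set<String> loggedIds) {
        Set<String> result = new TreeSet<>(missingIds);
        result.retainAll(loggedIds); // keep only IDs present in both sets
        return result;
    }

    /** IDs from the missing-records file that never appear in the log query results. */
    static Set<String> unmatched(Set<String> missingIds, Set<String> loggedIds) {
        Set<String> result = new TreeSet<>(missingIds);
        result.removeAll(loggedIds); // drop every ID the Lambda logged
        return result;
    }
}
```

Copying into a `TreeSet` leaves the inputs untouched and returns the IDs in sorted order, which keeps any printed report stable between runs.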

Main Components

  1. Lambda Handler (handleRequest method)

    • Purpose: Initiates the overall process by executing multiple log queries.
    • Input: AWS S3 event and Lambda execution context.
    • Output: String indicating the completion of processing.
    • Exceptions: Throws a RuntimeException in case of errors.
    • Example Usage:
                  S3Event s3Event = /* Populate S3 event */;
                  Context lambdaContext = /* Populate Lambda context */;
                  String result = handleRequest(s3Event, lambdaContext);
                
              
  2. executeMultipleLogQueries method

    • Purpose: Orchestrates the execution of multiple log queries for different Lambda functions.
    • Output: Results of log queries for each Lambda function.
    • Exceptions: Throws an exception if an error occurs during log queries.
    • Example Usage:
                  executeMultipleLogQueries();
                
              
    • Calling Structure:
      • Calls: calculateStartAndEndTimes method, generateParamsAndCreateLogs method.
  3. calculateStartAndEndTimes method

    • Purpose: Computes start and end times based on environment variables or specific date-time values.
    • Input: Environment variables or specified date-time values.
    • Output: Array of long values representing start and end times.
    • Example Usage:
                  long[] times = calculateStartAndEndTimes();
                
              
    • Calling Structure:
      • Calls: None.
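One way the time window could be computed is sketched here. Everything specific in it is an assumption: the variable names `QUERY_START` and `QUERY_END`, the ISO-8601 input format read as UTC, the 24-hour fallback, and the choice of epoch seconds (the unit the CloudWatch Logs `StartQuery` operation takes). The environment is passed in as a map so the logic can be exercised without real environment variables.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Map;

/** Hypothetical sketch of a start/end time calculation; not the project's implementation. */
final class QueryWindow {

    // Assumed variable names; the real code may use different ones.
    static final String START_VAR = "QUERY_START";
    static final String END_VAR = "QUERY_END";

    /**
     * Returns {start, end} as epoch seconds. Uses the ISO-8601 local date-times
     * found in env (interpreted as UTC) when both are set; otherwise falls back
     * to the 24 hours before now.
     */
    static long[] calculateStartAndEndTimes(Map<String, String> env, Instant now) {
        String start = env.get(START_VAR);
        String end = env.get(END_VAR);
        if (start == null || end == null) {
            long endTime = now.getEpochSecond();
            long startTime = now.minus(Duration.ofHours(24)).getEpochSecond();
            return new long[] {startTime, endTime};
        }
        long startTime = LocalDateTime.parse(start).toEpochSecond(ZoneOffset.UTC);
        long endTime = LocalDateTime.parse(end).toEpochSecond(ZoneOffset.UTC);
        if (startTime >= endTime) {
            throw new IllegalArgumentException("start must be before end");
        }
        return new long[] {startTime, endTime};
    }
}
```

In a Lambda this would be called as `QueryWindow.calculateStartAndEndTimes(System.getenv(), Instant.now())`.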
  4. generateParamsAndCreateLogs method

    • Purpose: Generates parameters for log queries and orchestrates their execution for multiple Lambda functions.
    • Input:
      • Start and end times
      • Array of log groups
      • Array of log queries
      • List of criteria predicates
      • List of ID extractor functions
    • Exceptions: Throws an exception if an error occurs during the process.
    • Example Usage:
                  long[] startAndEndTimes = /* Populate start and end times */;
                  String[] logGroups = /* Populate array of log groups */;
                  String[] queries = /* Populate array of log queries */;
                  List<Predicate<String>> criteria = /* Populate list of criteria predicates */;
                  List<Function<String, String>> idExtractors = /* Populate list of ID extractor functions */;
                  generateParamsAndCreateLogs(startAndEndTimes, logGroups, queries, criteria, idExtractors);
                
              
    • Calling Structure:
      • Calls: createLogInsightsLogFile method for each log group.
  5. createLogInsightsLogFile method

    • Purpose: Queries CloudWatch Logs Insights, writes the results to an output file, and performs analysis.
    • Input:
      • Log group name
      • Log query string
      • Start and end times
      • Predicate for filtering log lines
      • Function for extracting IDs from log lines
      • Output file path
      • Message indicating the context of the analysis
    • Exceptions: Throws an exception if an error occurs during CloudWatch Logs operations.
    • Example Usage:
                  String logGroupName = /* Populate log group name */;
                  String queryString = /* Populate log query string */;
                  long startTime = /* Populate start time */;
                  long endTime = /* Populate end time */;
                  Predicate<String> criteria = /* Populate criteria predicate */;
                  Function<String, String> idExtractor = /* Populate ID extractor function */;
                  String outputFile = /* Populate output file path */;
                  String message = /* Populate analysis context message */;
                  createLogInsightsLogFile(queryString, logGroupName, startTime, endTime, criteria, idExtractor, outputFile, message);
                
              
    • Calling Structure:
      • Calls: performAnalysis method.
  6. performAnalysis method

    • Purpose: Compares records from two files, identifying matched and unmatched records based on specific criteria.
    • Input:
      • Log group name
      • Predicate for filtering log lines
      • Function for extracting IDs from log lines
      • Output file path
      • Message indicating the context of the analysis
    • Exceptions: Throws an exception if an error occurs during file reading or analysis.
    • Example Usage:
                  String logGroupName = /* Populate log group name */;
                  Predicate<String> criteria = /* Populate criteria predicate */;
                  Function<String, String> idExtractor = /* Populate ID extractor function */;
                  String outputFile = /* Populate output file path */;
                  String message = /* Populate analysis context message */;
                  performAnalysis(logGroupName, criteria, idExtractor, outputFile, message);
                
              
    • Calling Structure:
      • Calls: printLogsToOutputFiles method, printMatchedAndUnmatchedRecords method.
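The filter-then-extract step that the criteria predicate and ID extractor imply can be sketched as follows. This is an illustration under assumptions, not the body of `performAnalysis`: the class `LogLineIds` and the method `collectIds` are hypothetical, and the log lines are taken as an in-memory list rather than read from the output file.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical helper showing how a predicate and an ID extractor combine. */
final class LogLineIds {

    /** Keeps lines that satisfy criteria, extracts an ID from each, and de-duplicates. */
    static Set<String> collectIds(List<String> lines,
                                  Predicate<String> criteria,
                                  Function<String, String> idExtractor) {
        return lines.stream()
                .filter(criteria)           // only lines at the processing point of interest
                .map(idExtractor)           // pull the record ID out of each line
                .filter(Objects::nonNull)   // skip lines where no ID could be extracted
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }
}
```

The resulting set can then be compared with the IDs from the missing-records file to separate matched from unmatched records.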
  7. printLogsToOutputFiles method

    • Purpose: Writes log entries from the query results to an output file, selecting specific fields.
    • Input:
      • Output file path
      • CloudWatch Logs query results
    • Example Usage:
                  String outputFile = /* Populate output file path */;
                  GetQueryResultsResponse getQueryResultsResponse = /* Populate CloudWatch Logs query results */;
                  printLogsToOutputFiles(outputFile, getQueryResultsResponse);
                
              
  8. printMatchedAndUnmatchedRecords method

    • Purpose: Prints matched and unmatched records based on specific criteria.
    • Input:
      • Log group name
      • Message indicating the context of the analysis
      • Set of IDs from the file
      • Set of correlation IDs from the log query results
    • Example Usage:
                  String logGroupName = /* Populate log group name */;
                  String message = /* Populate analysis context message */;
                  Set<String> ids = /* Populate set of IDs from the file */;
                  Set<String> correlationIds = /* Populate set of correlation IDs from the log query results */;
                  printMatchedAndUnmatchedRecords(logGroupName, message, ids, correlationIds);
                
              
  9. extractId method

    • Purpose: Extracts an ID from a log line based on a specified split character.
    • Input: Log line and split character.
    • Output: Extracted ID.
    • Example Usage:
                  String logLine = /* Populate log line */;
                  char splitChar = /* Populate split character */;
                  String extractedId = extractId(logLine, splitChar);
                
              
    • Calling Structure:
      • Calls: None.
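A possible shape for `extractId` is sketched here. The source does not say which part of the line holds the ID, so this version assumes it is the text after the last occurrence of the split character, and it returns null when the character is absent; both choices are assumptions, as is the class name `IdExtractor`.

```java
/** Hypothetical sketch of ID extraction from a log line. */
final class IdExtractor {

    /**
     * Returns the trimmed text after the last occurrence of splitChar,
     * or null when the line does not contain splitChar.
     * Where the ID sits in the line is an assumption of this sketch.
     */
    static String extractId(String line, char splitChar) {
        int index = line.lastIndexOf(splitChar);
        if (index < 0) {
            return null; // no delimiter, so no ID can be extracted
        }
        return line.substring(index + 1).trim();
    }
}
```

A method like this can be adapted to the `Function<String, String>` shape the other methods expect, for example `line -> IdExtractor.extractId(line, ':')`.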

Design Pattern: Template Method

  • Template Class (`ProcessController`):
    • The `ProcessController` class provides a template for executing a series of steps to handle S3 events.
    • The `handleRequest` method acts as the template method. It defines the overall structure of handling an S3 event, including logging, executing multiple log queries, and completing the processing.
  • Concrete Methods (`calculateStartAndEndTimes`, `performAnalysis`, `createLogInsightsLogFile`, `extractId`):
    • These methods within the `ProcessController` class represent specific steps of the algorithm.
    • They are declared in the template class but may be overridden by concrete subclasses.

Purpose and Benefits:

  • Purpose:
    • The purpose of the Template Method Pattern is to provide a common structure for a series of related algorithms, allowing code reuse and promoting consistency across subclasses.
  • Benefits:
    • Code Reusability: The template class contains the common algorithm, and concrete subclasses can reuse this code without duplicating the structure.
    • Consistency: The pattern enforces a consistent algorithm structure across subclasses, making it easier to maintain and understand.

Common Use Cases:

  • The Template Method Pattern is commonly used when a set of algorithms have a similar structure, but the specifics of each algorithm can vary.
  • It's suitable for situations where there is a need for code reuse and a desire to avoid code duplication.

Examples from the Codebase:

  1. Template Method (`handleRequest`):
            
       @Override
       public String handleRequest(S3Event event, Context context) {
           try {
               logger.log("INFO: In Handler");
               executeMultipleLogQueries();
               // generalTrackingService.executeMultipleLogQueriesGeneral();
           } catch (Exception e) {
               logger.log("ERROR: Could not complete operation, Error message: " + e.getMessage());
               throw new RuntimeException(e); // keep the original exception as the cause
           } finally {
               logger.log("INFO: Processing Complete");
           }
           return "Finished processing";
       }
            
          
  2. Concrete Method (`calculateStartAndEndTimes`):
            
       public long[] calculateStartAndEndTimes() {
           // ...
           return new long[]{startTime, endTime};
       }
            
          
  3. Concrete Method (`performAnalysis`):
            
       public void performAnalysis(String logGroupName, Predicate<String> criteria, Function<String, String> idExtractor, String outputFile, String message) throws Exception {
           // ...
       }
            
          
  4. Concrete Method (`createLogInsightsLogFile`):
            
       public void createLogInsightsLogFile(String queryString, String logGroupName, long startTime, long endTime, Predicate<String> criteria, Function<String, String> idExtractor, String outputFile, String message) throws Exception {
           // ...
       }
            
          
  5. Concrete Method (`extractId`):
            
       public String extractId(String line, char splitChar) {
           // ...
       }
            
          

In summary, the Template Method Pattern in this codebase provides a structured approach for handling S3 events while allowing flexibility in implementing specific steps of the algorithm in concrete methods.
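To make the pattern concrete, here is a generic sketch of Template Method in the same spirit. It does not reproduce `ProcessController`: the classes `TrackingTemplate` and `RecordingTracker`, the `runQueries` step, and the in-memory log are all hypothetical. The template method is `final` so subclasses can replace individual steps but not the order in which they run.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical template class: fixes the algorithm's skeleton, leaves steps open. */
abstract class TrackingTemplate {
    private final List<String> log = new ArrayList<>();

    /** Template method: logs, runs the steps in a fixed order, and always logs completion. */
    public final String handle() {
        try {
            log.add("INFO: In Handler");
            long[] window = calculateStartAndEndTimes();
            runQueries(window[0], window[1]);
        } catch (Exception e) {
            log.add("ERROR: " + e.getMessage());
            throw new RuntimeException(e);
        } finally {
            log.add("INFO: Processing Complete");
        }
        return "Finished processing";
    }

    /** Step with a default implementation; subclasses may override it. */
    protected long[] calculateStartAndEndTimes() {
        return new long[] {0L, 3600L};
    }

    /** Step every subclass must supply. */
    protected abstract void runQueries(long startTime, long endTime) throws Exception;

    List<String> log() {
        return log;
    }
}

/** Hypothetical concrete subclass that only records the window it was asked to query. */
final class RecordingTracker extends TrackingTemplate {
    final List<String> queried = new ArrayList<>();

    @Override
    protected void runQueries(long startTime, long endTime) {
        queried.add(startTime + "-" + endTime);
    }
}
```

Calling `new RecordingTracker().handle()` runs the inherited skeleton with the subclass's query step plugged in.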

API Proposal

Exposing this functionality as a RESTful API requires defining the endpoints, their purposes, input parameters, expected responses, and any authentication or authorization requirements. Because the code performs log analysis with AWS CloudWatch Logs, the proposed endpoints center on log analysis.

Assuming a Java framework such as Spring Boot, the endpoints could be defined as follows:

  1. Endpoint: /api/analyze-logs

    • Purpose: Perform log analysis based on the specified criteria.
    • HTTP Method: POST
    • Input Parameters:
      • logGroupName (String): The name of the CloudWatch log group.
      • startTime (Long): The start time for log analysis (epoch milliseconds).
      • endTime (Long): The end time for log analysis (epoch milliseconds).
      • query (String): The CloudWatch Logs Insights query.
    • Authentication/Authorization: AWS credentials or an authentication mechanism for accessing CloudWatch Logs.
    • Expected Response: JSON response containing the analysis results.
    {
      "matchingRecords": 10,
      "unmatchedRecords": 5,
      "matchedIds": ["id1", "id2", ...],
      "unmatchedIds": ["id3", "id4", ...]
    }
                
  2. Endpoint: /api/get-log-groups

    • Purpose: Retrieve a list of available CloudWatch log groups.
    • HTTP Method: GET
    • Input Parameters: None
    • Authentication/Authorization: AWS credentials or an authentication mechanism for accessing CloudWatch Logs.
    • Expected Response: JSON array containing the list of log group names.
    ["logGroup1", "logGroup2", ...]
                
  3. Endpoint: /api/get-log-analysis-params

    • Purpose: Retrieve parameters for log analysis (e.g., time range options, predefined queries).
    • HTTP Method: GET
    • Input Parameters: None
    • Authentication/Authorization: AWS credentials or an authentication mechanism for accessing CloudWatch Logs.
    • Expected Response: JSON object containing analysis parameter details.
    {
      "timeRangeOptions": ["Last 24 hours", "Last 7 days", ...],
      "predefinedQueries": {"query1": "Description 1", "query2": "Description 2", ...}
    }
                
  4. Endpoint: /api/execute-multiple-log-queries

    • Purpose: Execute multiple log queries based on predefined criteria.
    • HTTP Method: POST
    • Input Parameters:
      • timeRange (String): The time range for log analysis.
    • Authentication/Authorization: AWS credentials or an authentication mechanism for accessing CloudWatch Logs.
    • Expected Response: JSON response indicating the success of the operation.
    {
      "status": "success",
      "message": "Log queries executed successfully."
    }
                

Note: These endpoint definitions are examples and may need adjusting to fit specific requirements and the chosen framework. Any implementation should also include proper error handling and address security.
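As one illustration of the error handling mentioned in the note, the input to /api/analyze-logs could be validated before any query is started. This is a framework-free sketch: the class `AnalyzeLogsRequest` and its `validate` method are hypothetical, the field names follow the proposal above, and the specific rules are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical request body for POST /api/analyze-logs. */
final class AnalyzeLogsRequest {
    final String logGroupName;
    final long startTime; // epoch milliseconds, as in the proposal
    final long endTime;   // epoch milliseconds, as in the proposal
    final String query;

    AnalyzeLogsRequest(String logGroupName, long startTime, long endTime, String query) {
        this.logGroupName = logGroupName;
        this.startTime = startTime;
        this.endTime = endTime;
        this.query = query;
    }

    /** Returns one message per problem found; an empty list means the request is valid. */
    List<String> validate() {
        List<String> errors = new ArrayList<>();
        if (logGroupName == null || logGroupName.isBlank()) {
            errors.add("logGroupName is required");
        }
        if (query == null || query.isBlank()) {
            errors.add("query is required");
        }
        if (startTime >= endTime) {
            errors.add("startTime must be before endTime");
        }
        return errors;
    }
}
```

A controller could return the list with an HTTP 400 status when it is non-empty and proceed to the log query otherwise.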

Technologies and Languages

  • AWS Lambda: Serverless compute service.
  • AWS CloudWatch Logs: Service for log data storage and analysis.
  • Java: Programming language used for Lambda function development.

Conclusion

This document gives a high-level overview of the program and lays out the calling structure of its methods in a logical order.