39

I have an AWS::Events::Rule that routes an S3 put event to an ECS task. From the metrics I can see that the rule is being triggered, but I also see a FailedInvocation on every trigger. I suspect a permission / policy issue, but I cannot find any debug info or logs. Is this debug info available anywhere?

I found a similar issue with Lambda as the target, which needs an extra permission on the Lambda side to allow invocation from Events, but I was not able to find a similar setting for ECS. See "AWS Cloudformation - Invocation of Lambda by Rule Event failed".

Here is the related CloudFormation code, which shows the current role with the ECS target:

Resources:
  ECSTrigger:
    Type: AWS::Events::Rule
    Properties:
      ...
      Targets: # target of trigger: ECS
        - Arn:
            Fn::Sub: 'arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:cluster/${ClusterName}'
          Id: 'EcsTriggerTarget'
          InputTransformer:
            InputPathsMap:
              s3_bucket: "$.detail.requestParameters.bucketName"
              s3_key: "$.detail.requestParameters.key"
            InputTemplate: '{"containerOverrides": [{"environment": [{"name": "S3_BUCKET", "value": <s3_bucket>}, {"name": "S3_KEY", "value": <s3_key>}]}]}'
          EcsParameters:
            LaunchType: FARGATE
            PlatformVersion: LATEST
            TaskCount: 1
            TaskDefinitionArn:
              Ref: Task
            NetworkConfiguration:
              AwsVpcConfiguration:
                AssignPublicIp: DISABLED
                SecurityGroups: ...
                Subnets: ...
          RoleArn:
            Fn::GetAtt: EcsTriggerRole.Arn

  EcsTriggerRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action: 'sts:AssumeRole'
            Principal:
              Service: 'events.amazonaws.com'
      ManagedPolicyArns:
        - Fn::Sub: 'arn:${AWS::Partition}:iam::aws:policy/service-role/AmazonEC2ContainerServiceEventsRole'
3
  • Did you make sure your task definition is correct? Also, if you think it's an IAM issue, try substituting AmazonEC2ContainerServiceEventsRole with AmazonEC2ContainerServiceFullAccess. Jul 16, 2019 at 19:35
  • The task itself runs as expected (it produces logs). I have tried using Admin access, but that does not seem to work either.
    – lznt
    Jul 17, 2019 at 18:16
  • CloudTrail should show the event - you may or may not get anything useful.
    – tschumann
    Mar 24, 2021 at 0:02

8 Answers

31

I chatted with a Support Engineer at AWS today about this issue. According to them, debugging any FailedInvocation issues must be done at the resource-level and cannot be debugged at the EventBridge-level. From our chat:

I just confirmed from internal cloudwatch team, cloudwatch do not provide any logs for failed invocation. Apart from the failedinvocation metrics, there is no logging available from cloudwatch side. As mentioned, you need to rely on lambda logs or resources logs.

In other words, if your Rule invokes ECS (the resource), the only debug logs available are from ECS and not from EventBridge. I asked the support engineer to submit a feature request on my team's behalf, so you could also consider doing this via the AWS Support channels.

1
  • 6
    So let me get this straight - AWS expects us to use logs emitted by the target service, for failed target service invocations? Does anyone else see the problem here? I'm in the same position. My lambda isn't being invoked due to this error. Big surprise - there are no lambda logs... Between crap like this and the absolutely awful community documentation, AWS really, really gets on my nerves sometimes.
    – notAChance
    Aug 18, 2023 at 22:41
21

I just faced a similar situation. I had configured an EventBridge rule to run an ECS task periodically, and I was observing that the ECS task was not being invoked.

I then checked the RunTask event in CloudTrail, and there I finally found a clear error message:

User: arn:aws:sts::xxxx:assumed-role/Amazon_EventBridge_Invoke_ECS/xxx is not authorized to perform: ecs:RunTask on resource: arn:aws:ecs:us-east-1:xxxx:task-definition/ECS_task

which indicates that the role associated with the rule was not allowed to call ecs:RunTask on that task definition.

5
  • What was the event name you found this under?
    – trademark
    Sep 24, 2021 at 16:02
  • @trademark "RunTask"
    – Paolo
    Oct 6, 2021 at 17:04
  • THIS helped me fix it. I was able to see a BadRequestException due to a wrong ARN which could not be found anywhere in the logs. THANK YOU!
    – Mazzy
    Nov 16, 2021 at 16:37
  • This one helped me debug, thank you!
    – ChKl
    Nov 4, 2022 at 15:58
  • This didn't exactly solve my issue but inspired me to go look at the Lambda permissions and sure enough, the Rule didn't have permission to invoke the Lambda. That ended up being my issue. Thanks! Oct 6, 2023 at 15:48
13

In my case I had an EventBridge rule that picked up an event from AWS Config and sent it to an SNS topic.

When the event was fired from AWS Config, I could see in the graphs on the monitoring tab (Invocations and FailedInvocations) that EventBridge picked it up, but it never reached the SNS topic.

This was extremely hard to debug. I couldn't find any information in CloudWatch or CloudTrail. I finally made a breakthrough after setting up a dead-letter queue (an SQS queue) to capture failed deliveries to my target.

When inspecting the messages in the DLQ, I could see that there was something wrong with my Input Transformer. So I highly suggest setting up a DLQ for your rules to get more information about unprocessed events.
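For anyone who wants to try this in CloudFormation, here is a minimal sketch of attaching a DLQ to a rule target. The resource names (FailedInvocationsDlq, FailedInvocationsDlqPolicy, MyRule, MyTarget) are made up for illustration, and the elided parts stand for whatever your rule already contains, so treat it as a starting point rather than a drop-in template.

```yaml
Resources:
  # Hypothetical queue that receives events EventBridge could not deliver.
  FailedInvocationsDlq:
    Type: AWS::SQS::Queue

  # Resource policy that lets EventBridge send messages to the queue,
  # restricted to events coming from this one rule.
  FailedInvocationsDlqPolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      Queues:
        - Ref: FailedInvocationsDlq   # Ref on a queue returns its URL
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: 'events.amazonaws.com'
            Action: 'sqs:SendMessage'
            Resource:
              Fn::GetAtt: FailedInvocationsDlq.Arn
            Condition:
              ArnEquals:
                'aws:SourceArn':
                  Fn::GetAtt: MyRule.Arn

  MyRule:   # hypothetical name; use your existing rule
    Type: AWS::Events::Rule
    Properties:
      ...
      Targets:
        - Arn: ...
          Id: 'MyTarget'
          # Failed deliveries for this target go to the queue above.
          DeadLetterConfig:
            Arn:
              Fn::GetAtt: FailedInvocationsDlq.Arn
```

As far as I can tell, the reason for the failure shows up in the message attributes of the queued event (for example ERROR_CODE and ERROR_MESSAGE) rather than in the message body, so make sure to look at the attributes when you poll the queue.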

2

It turns out the issue was that I was missing a "name" inside "containerOverrides" in the InputTemplate. It works when I write it this way:

            InputTemplate:
              Fn::Sub: >-
                { "containerOverrides": [ {
                  "name": "${ServiceContainerName}",
                  "environment": [
                    { "name":"S3_BUCKET", "value":<s3_bucket> },
                    { "name":"S3_KEY", "value":<s3_key> } ]
                } ] }
1
  • 1
    But still, i was not able to find any debug information for the FailedInvocation... : (
    – lznt
    Jul 19, 2019 at 4:31
2

I got this error because of missing permissions: I created the resources with Terraform and made wrong assumptions about which permissions were needed.

If you are in the same situation, the best solution is to create a test rule manually in the AWS console, which creates the right IAM role for you. I then copied the same permissions into my Terraform policy and was able to make it work.


1

These two things tripped me up:

  1. I did not set the Launch Type to FARGATE, which my ECS task requires.
  2. I reused the role from a previous rule, but the policy on that role gave access to the wrong ECS task. Either let the console create a new role for you or, if you reuse an existing role, make sure it has permission to run the newly added ECS task.
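In terms of the template from the question, the second point could look roughly like the sketch below. It reuses the question's EcsTriggerRole and Task resources; the policy resource name and PolicyName are made up, and the statements are my understanding of what a role scoped to a single task needs, not a copy of what the console generates.

```yaml
Resources:
  # Hypothetical inline policy attached to the role that the rule assumes.
  EcsTriggerRunTaskPolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: 'ecs-trigger-run-task'   # made-up name
      Roles:
        - Ref: EcsTriggerRole
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          # Allow starting only this task definition.
          - Effect: Allow
            Action: 'ecs:RunTask'
            Resource:
              Ref: Task   # Ref on AWS::ECS::TaskDefinition returns its ARN
          # RunTask also has to pass the task role and execution role to ECS.
          - Effect: Allow
            Action: 'iam:PassRole'
            Resource: '*'   # assumption: narrow this to your task's role ARNs
            Condition:
              StringLike:
                'iam:PassedToService': 'ecs-tasks.amazonaws.com'
```

If the rule later points at a different task definition, the ecs:RunTask statement has to be updated as well, which is exactly the trap with a reused role.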
1

As of May 2023, you can migrate your CloudWatch rules over to EventBridge.

There you can specify that invocation failures get sent to an SQS message queue.

The queue message will give you enough information to diagnose the problem.

0

In my case the event to search for in CloudTrail was NotifyEvent. I was trying to start a Glue workflow when a DataSync task succeeded. Hope it helps. Interestingly, the dead-letter queue did not help here: there were no messages in it.
