| 1 | | -# replace this |
| 1 | +# Serverless Website Analytics |
| 2 | + |
| 3 | +This is a CDK serverless website analytics solution that can be deployed to AWS. This construct creates the backend, |
| 4 | +the frontend and the ingestion API. |
| 5 | + |
| 6 | +This solution was designed for multiple websites with low to moderate traffic. It is designed to be as cheap as |
| 7 | +possible, but it is not free. The cost is mostly driven by the ingestion API that saves the data to S3 through a |
| 8 | +Kinesis Firehose. |
| 9 | + |
| 10 | +You can see a LIVE DEMO [HERE](https://demo.serverless-website-analytics.com/). |
| 11 | + |
| 12 | +## Objectives |
| 13 | +- Easy to deploy in your AWS account, any *region |
| 14 | +- The target audience is small to medium websites with low to moderate traffic (less than 10M views) |
| 15 | +- Lowest possible cost |
| 16 | +- KISS |
| 17 | +- No direct server side state |
| 18 | +- Low maintenance |
| 19 | + |
| 20 | +The main objective is to keep it simple and the operational cost low, staying true to the "scale to 0" tenets of serverless, |
| 21 | +even if that goes against "best practices". |
| 22 | + |
| 23 | +## Getting started |
| 24 | + |
| 25 | +### Serverside setup |
| 26 | + |
| 27 | +> ⚠️ Requires `aws-cdk` and `aws-cdk-lib` > 2.79.1 in your project |
| 28 | + |
| 29 | +Install the [CDK construct library](https://www.npmjs.com/package/serverless-website-analytics) in your project: |
| 30 | +``` |
| 31 | +npm install serverless-website-analytics |
| 32 | +``` |
| 33 | + |
| 34 | +Add the construct to your stack: |
| 35 | +```typescript |
| 36 | +import { Swa } from 'serverless-website-analytics'; |
| 37 | + |
| 38 | +export class App extends cdk.Stack { |
| 39 | + constructor(scope: Construct, id: string, props?: cdk.StackProps) { |
| 40 | + super(scope, id, props); |
| 41 | + |
| 42 | + ... |
| 43 | + |
| 44 | + new Swa(this, 'swa-demo-codesnippet-screenshot', { |
| 45 | + environment: 'prod', |
| 46 | + awsEnv: { |
| 47 | + account: this.account, |
| 48 | + region: this.region, |
| 49 | + }, |
| 50 | + sites: ['example.com', 'tests1', 'tests2'], |
| 51 | + allowedOrigins: ['*'], |
| 52 | + /* None and Basic Auth also available, see options below */ |
| 53 | + auth: { |
| 54 | + cognito: { |
| 55 | + loginSubDomain: 'login', |
| 56 | + users: [ |
| 57 | + { name: '<full name>', email: '<[email protected]>' }, |
| 58 | + ] |
| 59 | + } |
| 60 | + }, |
| 61 | + /* Optional, if not specified uses default CloudFront and Cognito domains */ |
| 62 | + domain: { |
| 63 | + name: 'demo.serverless-website-analytics.com', |
| 64 | + certificate: wildCardCertUsEast1, |
| 65 | + /* Optional, if not specified then no DNS records will be created. You will have to create the DNS records yourself. */ |
| 66 | + hostedZone: route53.HostedZone.fromHostedZoneAttributes(this, 'HostedZone', { |
| 67 | + hostedZoneId: 'Z00387321EPPVXNC20CIS', |
| 68 | + zoneName: 'demo.serverless-website-analytics.com', |
| 69 | + }), |
| 70 | + } |
| 71 | + }); |
| 72 | + |
| 73 | + } |
| 74 | +} |
| 75 | +``` |
| 76 | + |
| 77 | +Quick option rundown: |
| 78 | +- `sites`: The list of allowed sites. An entry does not have to be a domain name; it can be any string |
| 79 | + you want to use to identify a site. The client side script that sends analytics has to specify one of these names. |
| 80 | +- `allowedOrigins`: The origins that are allowed to make requests to the backend Ingest API. This CORS check is done as an extra |
| 81 | + security measure to prevent other sites from making requests to your backend. Each entry must include the protocol and |
| 82 | + full domain. Ex: If your site is `example.com` and it can be accessed using `https://example.com` and |
| 83 | + `https://www.example.com` then both need to be listed. A value of `*` specifies that all origins are allowed. |
| 84 | +- `auth`: The auth configuration which defaults to none. If you want to enable auth, you can specify either Basic Auth or |
| 85 | + Cognito auth but not both. |
| 86 | + - `undefined`: If not specified, then no authentication is applied, everything is publicly available. |
| 87 | + - `basicAuth`: Uses a CloudFront function to validate the Basic Auth credentials. The credentials are hard coded in |
| 88 | + that function. This is not recommended for production; it also only secures the HTML page, the API is still |
| 89 | + accessible without auth. |
| 90 | + - `cognito`: Uses an AWS Cognito user pool. Users will get a temporary password via email after deployment. They will |
| 91 | + then be prompted to change their password on first login. This is the recommended option for production as it uses |
| 92 | + JWT tokens to secure the API as well. |
| 93 | +- `domain`: If specified, it will create the CloudFront and Cognito resources at the specified domain and optionally |
| 94 | + create the DNS records in the specified Route53 hosted zone. If not specified, it uses the default autogenerated |
| 95 | + CloudFront(`cloudfront.net`) and Cognito(`auth.us-east-1.amazoncognito.com`) domains. You can read the website URL |
| 96 | + from the stack output. |
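
The matching rule described for `allowedOrigins` can be sketched as follows. This is only an illustration under stated assumptions: `isOriginAllowed` is a hypothetical helper and not part of the construct's API, and the real check in the Ingest API may be implemented differently.

```typescript
// Hypothetical helper (not part of serverless-website-analytics): it only
// illustrates the matching rule described for `allowedOrigins`.
function isOriginAllowed(origin: string | undefined, allowedOrigins: string[]): boolean {
  // A `*` entry means every origin is allowed.
  if (allowedOrigins.includes('*')) return true;
  // Without an Origin header there is nothing to match against the list.
  if (!origin) return false;
  // Exact match on protocol + full domain, so `https://example.com` and
  // `https://www.example.com` are two different entries.
  return allowedOrigins.includes(origin);
}

const allowed = ['https://example.com', 'https://www.example.com'];
console.log(isOriginAllowed('https://www.example.com', allowed));
console.log(isOriginAllowed('http://example.com', allowed));
```

The second call differs from the listed entries only in its protocol, which is why both the protocol and the full domain have to be listed for every variant of the site.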
| 97 | + |
| 98 | +For a full list of options see the [API.md](docs/API.md) docs. |
| 99 | + |
| 100 | +### Client side setup |
| 101 | + |
| 102 | +> ⚠️ IMPORTANT! **After** the client has sent its first data, you have to click on the "Add Partitions" button in the |
| 103 | +> frontend to auto discover and add the site, month, day partitions. Otherwise, the data will not show up in the charts. |
| 104 | +> This operation has to be repeated at the beginning of every month. |
| 105 | + |
| 106 | +Install the [client side library](https://www.npmjs.com/package/serverless-website-analytics-client): |
| 107 | +``` |
| 108 | +npm install serverless-website-analytics-client |
| 109 | +``` |
| 110 | + |
| 111 | +Irrespective of the framework, you have to do the following to track page views on your site: |
| 112 | + |
| 113 | +1. Initialize the client only once with `analyticsPageInit`. The site name must correspond with one that you specified |
| 114 | + when deploying the `serverless-website-analytics` backend. You also need the URL to the backend. Make sure your frontend |
| 115 | + site's `Origin` is whitelisted in the backend config. |
| 116 | +2. On each route change call the `analyticsPageChange` function with the name of the new page. |
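
The two steps above can be sketched in a framework-agnostic way. This is a minimal sketch, not the client's own code: `PageAnalytics` is a locally declared interface that mirrors the two function names mentioned above, `createPageTracker` is a hypothetical wrapper, and `fakeClient` stands in for the real `swaClient.v1` object so the example is self-contained.

```typescript
// Assumed shape of the two client functions named above, declared locally
// so that this sketch does not depend on the real package.
interface PageAnalytics {
  analyticsPageInit(options: { inBrowser: boolean; site: string; apiUrl: string }): void;
  analyticsPageChange(path: string): void;
}

// Hypothetical wrapper: guarantees a single init and forwards route changes.
function createPageTracker(client: PageAnalytics, site: string, apiUrl: string) {
  let initialized = false;
  return {
    init(): void {
      if (initialized) return; // step 1: initialize only once
      client.analyticsPageInit({ inBrowser: true, site, apiUrl });
      initialized = true;
    },
    pageChanged(path: string): void {
      if (!initialized) throw new Error('call init() before tracking page changes');
      client.analyticsPageChange(path); // step 2: report every route change
    },
  };
}

// Recording fake used in place of the real client for illustration.
const calls: string[] = [];
const fakeClient: PageAnalytics = {
  analyticsPageInit: (options) => { calls.push(`init:${options.site}`); },
  analyticsPageChange: (path) => { calls.push(`change:${path}`); },
};

const tracker = createPageTracker(fakeClient, 'example.com', 'https://analytics.example.com');
tracker.init();
tracker.init(); // ignored, already initialized
tracker.pageChanged('/about');
console.log(calls);
```

In a real project the fake would be replaced by the imported client, and `pageChanged` would be called from the router's navigation hook.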
| 117 | + |
| 118 | +The following section shows how to do it in Vue. See [the readme of the client](https://github.com/rehanvdm/serverless-website-analytics-client-development#usage) |
| 119 | +for other framework setups. |
| 120 | + |
| 121 | +#### Vue |
| 122 | + |
| 123 | +[_./serverless-website-analytics-client/usage/vue/vue-project/src/main.ts_](https://github.com/rehanvdm/serverless-website-analytics-client-development/blob/master/usage/vue/vue-project/src/main.ts) |
| 124 | +```typescript |
| 125 | +... |
| 126 | +import * as swaClient from 'serverless-website-analytics-client'; |
| 127 | + |
| 128 | +const app = createApp(App); |
| 129 | +app.use(router); |
| 130 | + |
| 131 | +swaClient.v1.analyticsPageInit({ |
| 132 | + inBrowser: true, //Not SSR |
| 133 | + site: "<Friendly site name>", //example.com |
| 134 | + apiUrl: "<Your serverless-website-analytics URL>", //https://my-serverless-website-analytics-backend.com |
| 135 | + // debug: true, |
| 136 | +}); |
| 137 | +router.afterEach((event) => { |
| 138 | + swaClient.v1.analyticsPageChange(event.path); |
| 139 | +}); |
| 140 | + |
| 141 | +app.mount('#app'); |
| 142 | +``` |
| 143 | + |
| 144 | +## What's in the box |
| 145 | + |
| 146 | +The architecture consists of four components: frontend, backend, ingestion API and the client JS library. |
| 147 | + |
| 148 | + |
| 149 | + |
| 150 | +### Frontend |
| 151 | + |
| 152 | +The frontend is a Vue 3 SPA that is hosted on S3 and served through AWS CloudFront. |
| 153 | +The [Element Plus](https://element-plus.org/en-US/) frontend framework is used for the UI components |
| 154 | +and [Plotly](https://plotly.com/javascript/) for the charts. |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | + |
| 159 | + |
| 160 | +### Backend |
| 161 | + |
| 162 | +This is a Lambda-lith that is reached through its Lambda Function URL (FURL), which is reverse proxied by CloudFront. It is written |
| 163 | +in TypeScript and uses [tRPC](https://trpc.io/) to handle API requests. |
| 164 | + |
| 165 | +The queries to Athena are synchronous, so the connection timeout between CloudFront and the FURL has been increased |
| 166 | +to 60 seconds. |
| 167 | + |
| 168 | +There are three available authentication configurations: |
| 169 | +- **None**, it is open to the public |
| 170 | +- **Basic Authentication**, basic protection for the index.html file |
| 171 | +- **AWS Cognito**, recommended for production |
| 172 | + |
| 173 | +⚠️ Partitions are not automatically created in Athena, they have to be created manually by the user by clicking the |
| 174 | +"Create/Refresh Partitions" button in the frontend. This has to be done whenever a new site is added or a new month |
| 175 | +starts. |
| 176 | + |
| 177 | +### Ingestion API |
| 178 | + |
| 179 | +Similar to the backend, it is also a TS Lambda-lith that is reached through its FURL, which is reverse proxied by CloudFront. |
| 180 | +It also uses [tRPC](https://trpc.io/) but uses the [trpc-openapi](https://github.com/jlalmes/trpc-openapi) package to |
| 181 | +generate an OpenAPI spec. This is used to generate the API types used in the [client JS package](https://www.npmjs.com/package/serverless-website-analytics-client) |
| 182 | +and can also be used to generate other language client libraries. |
| 183 | + |
| 184 | +The Lambda function then saves the data to S3 through a Kinesis Firehose. The Firehose is configured to save the data |
| 185 | +in a partitioned manner, by site, year and month. The data is saved in Parquet format. Location data is obtained by |
| 186 | +looking up the IP address in the [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geoip2/geolite2/) database. We don't |
| 187 | +store any Personally Identifiable Information (PII) in the logs; the IP address is never stored. |
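
To make the partitioning by site, year and month more concrete, here is a minimal sketch of how an object key prefix could be derived from an event. The Hive-style `key=value` layout and the `partitionPrefix` helper are assumptions made for illustration; the actual prefix configured on the Firehose is not shown in this README and may differ.

```typescript
// Hypothetical helper: the Hive-style `key=value` layout is an assumption,
// not the documented key format of the Firehose delivery stream.
function partitionPrefix(site: string, timestamp: Date): string {
  const year = timestamp.getUTCFullYear();
  const month = timestamp.getUTCMonth() + 1; // getUTCMonth() is zero-based
  return `site=${site}/year=${year}/month=${month}/`;
}

console.log(partitionPrefix('example.com', new Date(Date.UTC(2023, 4, 17))));
```

With a layout like this, a query that filters on site and month only has to read the objects under the matching prefix, which helps keep the amount of data scanned by Athena small.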
| 188 | + |
| 189 | +## Contributing |
| 190 | + |
| 191 | +See [CONTRIBUTING.md](docs/CONTRIBUTING.md) for more info on how to contribute and on the design decisions. |
| 192 | + |
| 193 | +## Roadmap |
| 194 | + |
| 195 | +Can be found [here](https://github.com/users/rehanvdm/projects/1/views/1). |