vince

API-first, high-performance, self-hosted, cost-effective and privacy-friendly web analytics server for organizations of any size.

What is vince?

Vince is a modern server for collecting and analyzing website analytics. It focuses on modern web application development, emphasizing ease of use in deployment, maintenance and integration with existing infrastructure.

It ships as a single standalone binary called vince, with zero dependencies.

Features



Getting started

Installation

Installation script

curl -fsSL https://vinceanalytics.com/install.sh | bash

Homebrew

brew install vinceanalytics/tap/vince

Container image

docker pull ghcr.io/vinceanalytics/vince

Starting server

vince --data=vince-data --domains=example.com --nodeId=1

This starts the vince server listening on port 8080.

Check that your server is up and running:

$ curl http://localhost:8080/version
{
  "version": "v0.0.62"
}

Add the script to your website

To integrate your website with Vince Analytics, you need to be able to edit the HTML of the website you want to track. Paste your Vince Analytics tracking script within the <head> … </head> tags of your site.

Your Vince Analytics tracking script code will look something like this.

Note: data-api is the URL where your vince instance listens for and accepts events (the events path is /api/event). data-domain is the site_id, that is, the domain name you want to monitor for events.

Development script on localhost

<script data-domain="example.com" data-api="http://localhost:8080/api/event">
(function(){"use strict";var t,n,o,a,l,r=window.location,e=window.document,c=e.currentScript,h=c.getAttribute("data-api")||d(c);function u(e){console.warn("Ignoring Event: "+e)}function d(e){return new URL(e.src).origin+"/api/event"}function i(t,n){try{if(window.localStorage.vince_ignore==="true")return u("localStorage flag")}catch{}var o,s={};s.n=t,s.u=r.href,s.d=c.getAttribute("data-domain"),s.r=e.referrer||null,s.w=window.innerWidth,n&&n.meta&&(s.m=JSON.stringify(n.meta)),n&&n.props&&(s.p=n.props),o=new XMLHttpRequest,o.open("POST",h,!0),o.setRequestHeader("Content-Type","text/plain"),o.send(JSON.stringify(s)),o.onreadystatechange=function(){o.readyState===4&&n&&n.callback&&n.callback()}}a=window.vince&&window.vince.q||[],window.vince=i;for(t=0;t<a.length;t++)i.apply(this,a[t]);function s(){if(o===r.pathname)return;o=r.pathname,i("pageview")}n=window.history,n.pushState&&(l=n.pushState,n.pushState=function(){l.apply(this,arguments),s()},window.addEventListener("popstate",s));function m(){!o&&e.visibilityState==="visible"&&s()}e.visibilityState==="prerender"?e.addEventListener("visibilitychange",m):s()})()
</script>

Production script

<script data-domain="vinceanalytics.com" data-api="http://api.vinceanalytics.com/api/event">
(function(){"use strict";var n,s,i,r,d,t=window.location,e=window.document,c=e.currentScript,h=c.getAttribute("data-api")||u(c);function l(e){console.warn("Ignoring Event: "+e)}function u(e){return new URL(e.src).origin+"/api/event"}function a(n,s){if(/^localhost$|^127(\.[0-9]+){0,2}\.[0-9]+$|^\[::1?\]$/.test(t.hostname)||t.protocol==="file:")return l("localhost");if(window._phantom||window.__nightmare||window.navigator.webdriver||window.Cypress)return;try{if(window.localStorage.vince_ignore==="true")return l("localStorage flag")}catch{}var i,o={};o.n=n,o.u=t.href,o.d=c.getAttribute("data-domain"),o.r=e.referrer||null,o.w=window.innerWidth,s&&s.meta&&(o.m=JSON.stringify(s.meta)),s&&s.props&&(o.p=s.props),i=new XMLHttpRequest,i.open("POST",h,!0),i.setRequestHeader("Content-Type","text/plain"),i.send(JSON.stringify(o)),i.onreadystatechange=function(){i.readyState===4&&s&&s.callback&&s.callback()}}r=window.vince&&window.vince.q||[],window.vince=a;for(n=0;n<r.length;n++)a.apply(this,r[n]);function o(){if(i===t.pathname)return;i=t.pathname,a("pageview")}s=window.history,s.pushState&&(d=s.pushState,s.pushState=function(){d.apply(this,arguments),o()},window.addEventListener("popstate",o));function m(){!i&&e.visibilityState==="visible"&&o()}e.visibilityState==="prerender"?e.addEventListener("visibilitychange",m):o()})()
</script>

After adding the script, no further configuration of the website is needed. When a user visits the website, events are sent to your Vince Analytics instance.



Configuration

vince has three ways of passing configuration: command-line flags, environment variables and a configuration file.

All three can be combined to form a secure deployment. The level of precedence follows cli -> env -> file, from lowest to highest. For example, if listen is provided in all three ways, the value set in the file is used.

We recommend using command-line flags and environment variables. Anything that can be expressed in the configuration file can also be expressed with command-line flags and environment variables.
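
The precedence rule described above can be sketched as a simple merge. This is only an illustration of the documented order (a file value overrides an environment value, which overrides a flag value), not vince's actual loader; the function name resolve_config and the sample values are hypothetical.

```python
def resolve_config(cli: dict, env: dict, file: dict) -> dict:
    """Merge three option sources; later sources override earlier ones.

    The order follows the documented rule cli -> env -> file, so a value
    set in the configuration file wins when an option is set everywhere.
    """
    merged: dict = {}
    for source in (cli, env, file):
        merged.update(source)
    return merged


config = resolve_config(
    cli={"listen": ":8080", "data": "vince-data"},
    env={"listen": ":9090"},
    file={"listen": ":7070"},
)
print(config["listen"])  # :7070 (the file value)
print(config["data"])    # vince-data (only set on the command line)
```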

Data

Path to the directory where vince stores persistent data. This option is required for vince to be operational.

env
VINCE_DATA example VINCE_DATA=path/to/storage
flag
--data example --data=path/to/storage
file
data
{"data":"path/to/storage"}

Listen

HTTP host:port on which the vince server listens for HTTP requests.

env
VINCE_LISTEN example VINCE_LISTEN=:8080
flag
--listen example --listen=:8080
file
listen
{"listen":":8080"}

Rate Limit

A float representing requests/second. It is applied to /api/event and /api/v1/event to protect them against excessive incoming requests.

By default there is no limit.

Limits are applied globally across all sites. Per-site limits are not supported yet, but they are planned for a future release.

env
VINCE_RATE_LIMIT example VINCE_RATE_LIMIT=1.7976931348623157e+308
flag
--rateLimit example --rateLimit=1.7976931348623157e+308
file
rateLimit
{"rateLimit":1.7976931348623157e+308}
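
To make the requests/second setting more concrete, here is a minimal token bucket sketch, one common way to enforce such a limit. It is not taken from vince's source; the class name TokenBucket, the burst parameter and the explicit now argument are assumptions made for illustration. The example value 1.7976931348623157e+308 is the largest finite 64-bit float, so in this sketch it behaves as "no limit".

```python
class TokenBucket:
    """Allow on average `rate` requests per second, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int = 1) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        # Refill tokens for the time elapsed since the previous call,
        # never exceeding the bucket capacity.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limiter = TokenBucket(rate=2.0)  # hypothetical limit: 2 requests/second
print(limiter.allow(0.0), limiter.allow(0.25), limiter.allow(0.75))
```

Passing the clock value in explicitly keeps the sketch deterministic; a real server would read a monotonic clock instead.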

Granule Size

Size in bytes that the in-memory indexes and parts may reach before they are compacted and stored on disk.

By default 16MB is used. You may need to adjust this depending on your deployment resources.

16MB is a very conservative balance. This option is for power users; production deployments can stick with the default.

When the in-memory index plus arrow.Record reaches this size, the arrow record is converted to a parquet file with compression enabled, and the resulting file is stored in a durable key/value store with the configured retention period.

So while 16MB is held in memory, what actually goes to disk is smaller than this value. The index is also compressed before it is stored.

env
VINCE_GRANULE_SIZE example VINCE_GRANULE_SIZE=16777216
flag
--granuleSize example --granuleSize=16777216
file
granuleSize
{"granuleSize":"16777216"}

Geo IP DB

Path to geo ip database. We use this to obtain city, country and region information of an event.

This is optional.

env
VINCE_GEOIP_DB example VINCE_GEOIP_DB=path/to/geoip_db
flag
--geoipDbPath example --geoipDbPath=path/to/geoip_db
file
geoipDbPath
{"geoipDbPath":"path/to/geoip_db"}

Domains

A list of domains managed by the vince instance. No extra configuration is needed on the client side to send events to vince.

Events submitted for a domain that is not registered are rejected. If you no longer manage a site, simply remove it from this list and vince will stop accepting events for it.

Domain is the hostname. For example, https://vinceanalytics.com has the domain vinceanalytics.com.

env
VINCE_DOMAINS example VINCE_DOMAINS=vinceanalytics.com,example.com
flag
--domains example --domains=vinceanalytics.com,example.com
file
domains
{"domains":["vinceanalytics.com","example.com"]}
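
As a small illustration of "domain is the hostname", the sketch below derives the domain from a URL and checks it against a configured list. The helper names domain_of and is_registered are hypothetical, and the check only mirrors the rule described above; it is not vince's implementation.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the hostname part of a URL, the value to list in domains."""
    return urlparse(url).hostname or ""


def is_registered(event_domain: str, domains: list[str]) -> bool:
    """True when the event's domain is one of the configured domains."""
    return event_domain in domains


domains = ["vinceanalytics.com", "example.com"]
print(domain_of("https://vinceanalytics.com/blog"))  # vinceanalytics.com
print(is_registered("unknown.org", domains))         # False
```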

Configuration file path

Path to the configuration file. When provided, it is read and merged with the rest of the configuration. Values set in this file take precedence.

env
VINCE_CONFIG example VINCE_CONFIG=path/to/config.json
flag
--config example --config=path/to/config.json

Log level

How much is logged to stdout.

Default is INFO. Valid values are DEBUG, INFO, WARN and ERROR.

env
VINCE_LOG_LEVEL example VINCE_LOG_LEVEL=INFO
flag
--logLevel example --logLevel=INFO

Retention Period

How long data will stay in permanent storage. Older data will automatically be deleted and the space reclaimed.

Default is 30 days.

env
VINCE_RETENTION_PERIOD example VINCE_RETENTION_PERIOD=720h0m0s
flag
--retentionPeriod example --retentionPeriod=720h0m0s
file
retentionPeriod
{"retentionPeriod":"2592000s"}
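
The examples above write the same 30-day default in two ways, 720h0m0s and 2592000s. The sketch below checks that arithmetic by converting both forms to seconds. It handles only the h, m and s units that appear in this document; the function name duration_seconds is hypothetical and this is not the parser vince uses.

```python
import re

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def duration_seconds(text: str) -> int:
    """Convert a duration such as "720h0m0s" or "2592000s" to whole seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject input with anything other than <number><unit> groups.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(number) * _UNIT_SECONDS[unit] for number, unit in parts)


print(duration_seconds("720h0m0s"))           # 2592000
print(duration_seconds("2592000s") // 86400)  # 30 (days)
```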

Automatic TLS

vince supports automatic TLS using an ACME client with Let’s Encrypt.

Enabling automatic TLS

env
VINCE_AUTO_TLS example VINCE_AUTO_TLS=true
flag
--autoTls example --autoTls=true
file
autoTls
{"autoTls":"true"}

You need to set up the account email address and the domain to generate a certificate for, using the acmeEmail and acmeDomain options.

acmeEmail

This is used by CAs, such as Let’s Encrypt, to notify about problems with issued certificates.

env
VINCE_ACME_EMAIL example VINCE_ACME_EMAIL=example@example.org
flag
--acmeEmail example --acmeEmail=example@example.org

acmeDomain

acmeDomain should be the domain name that points to your server. For example, we host a vince instance on cloud.vinceanalytics.com, so we use that as acmeDomain.

env
VINCE_ACME_DOMAIN example VINCE_ACME_DOMAIN=example.org
flag
--acmeDomain example --acmeDomain=example.org

file

{"acme":{"email":"example@example.org","domain":"example.org"}}

Authorization

vince supports bearer token authorization via the authToken option. All endpoints except /api/event are protected.

authToken

When set, client calls without this bearer token are rejected.

This is sensitive information, so use the environment variable to set it.

env
VINCE_AUTH_TOKEN example VINCE_AUTH_TOKEN=xxx
flag
--authToken example --authToken=xxx


Stats API

CC-BY-SA-4.0: this section was initially copied from the Plausible Analytics docs.

The vince API offers a way to retrieve your stats programmatically. It’s a read-only interface to present historical and real-time stats only. Take a look at the Events API if you want to send pageviews or custom events.

The API accepts GET requests with query parameters and returns standard HTTP responses along with a JSON-encoded body. All API requests must be made over HTTPS. Calls made over plain HTTP will fail. API requests without authentication will also fail.

Each request must be authenticated with an authToken using the Bearer Token method.

Concepts

Querying the vince API will feel familiar if you have used time-series databases before. You can’t query individual records from our stats database. You can only request aggregated metrics over a certain time period.

Each request requires a site_id parameter which is the domain of your site as configured in domains.
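
The request shape described so far (GET, query parameters, a required site_id and a bearer token) can be sketched with Python's standard library. The helper name stats_request is hypothetical, and the base URL and token mirror the placeholder values used elsewhere in this document. The function only builds the request; it does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def stats_request(base_url: str, endpoint: str, site_id: str, token: str,
                  **params: str) -> Request:
    """Build (but do not send) an authenticated GET request for a stats endpoint."""
    # safe="," keeps comma-separated lists such as metrics readable in the URL.
    query = urlencode({"site_id": site_id, **params}, safe=",")
    return Request(
        f"{base_url}/api/v1/stats/{endpoint}?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


req = stats_request("http://localhost:8080", "aggregate", "example.com", "xxx",
                    period="6mo", metrics="visitors,pageviews")
print(req.full_url)
```

To actually perform the call, the request can be passed to urllib.request.urlopen and the body decoded with json.load.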

Metrics

You can specify a metrics option in the query to choose which metrics are returned. See the All Metrics section for a full overview of metrics and their definitions. The metrics currently supported in the Stats API are:

| Metric | Description |
|---|---|
| visitors | The number of unique visitors. |
| visits | The number of visits/sessions. |
| pageviews | The number of pageview events. |
| views_per_visit | The number of pageviews divided by the number of visits. Returns a floating point number. Currently only supported in the Aggregate and Timeseries endpoints. |
| bounce_rate | Bounce rate percentage. |
| visit_duration | Visit duration in seconds. |
| events | The number of events (pageviews + custom events). |

Time periods

The options are identical for each endpoint that supports configurable time periods. Each period is relative to a date parameter. The date should follow the standard ISO-8601 format. When not specified, the date field defaults to today(site.timezone). All time calculations on our backend are done in the time zone that the site is configured in.

When using a custom range, the date parameter expects two ISO-8601 formatted dates joined with a comma as follows ?period=custom&date=2021-01-01,2021-01-31. Stats will be returned for the whole date range inclusive of the start and end dates.
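
A minimal sketch of building the custom range query shown above from two dates. The helper name custom_period is hypothetical; the output format is the one given in the example (?period=custom&date=2021-01-01,2021-01-31).

```python
from datetime import date
from urllib.parse import urlencode


def custom_period(start: date, end: date) -> str:
    """Return the query fragment for an inclusive custom date range."""
    if end < start:
        raise ValueError("end date must not be before start date")
    # isoformat() yields ISO-8601 dates such as 2021-01-01.
    value = f"{start.isoformat()},{end.isoformat()}"
    # safe="," keeps the comma that joins the two dates unescaped.
    return urlencode({"period": "custom", "date": value}, safe=",")


print(custom_period(date(2021, 1, 1), date(2021, 1, 31)))
# period=custom&date=2021-01-01,2021-01-31
```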

Properties

Each pageview and custom event in our database has some predefined properties associated with it. In other analytics tools, these are often referred to as dimensions as well. Properties can be used for filtering and breaking down your stats to drill into more depth. Here’s the full list of properties we collect automatically:

| Property | Example | Description |
|---|---|---|
| page | /blog/remove-google-analytics | Pathname of the page where the event is triggered. You can also use an asterisk to group multiple pages (/blog*). |
| entry_page | /home | Page on which the visit session started (landing page). |
| exit_page | /home | Page on which the visit session ended (last page viewed). |
| source | Twitter | Visit source, populated from a URL query parameter tag (utm_source, source or ref) or the Referer HTTP header. |
| referrer | t.co/fzWTE9OTPt | Raw Referer header without http://, https:// or www. |
| utm_medium | social | Raw value of the utm_medium query param on the entry page. |
| utm_source | twitter | Raw value of the utm_source query param on the entry page. |
| utm_campaign | profile | Raw value of the utm_campaign query param on the entry page. |
| utm_content | banner | Raw value of the utm_content query param on the entry page. |
| utm_term | keyword | Raw value of the utm_term query param on the entry page. |
| device | Desktop | Device type. Possible values are Desktop, Laptop, Tablet and Mobile. |
| browser | Chrome | Name of the browser vendor. Most popular ones are Chrome, Safari and Firefox. |
| browser_version | 88.0.4324.146 | Version number of the browser used by the visitor. |
| os | Mac | Name of the operating system. Most popular ones are Mac, Windows, iOS and Android. Linux distributions are reported separately. |
| os_version | 10.6 | Version number of the operating system used by the visitor. |
| country | United Kingdom | Country of the visitor. |
| region | England | Region of the visitor. |
| city | London | City of the visitor. |

Filtering

Most endpoints support a filters query parameter to drill down into your data. You can filter by all properties described in the Properties table, using the following operators:

| Operator | Usage example | Explanation |
|---|---|---|
| == | name==Signup | Simple equality - custom event “Signup” |
| != | country!=Tanzania | Simple inequality - country is not Tanzania |
| ~= | page~=^/blog/.*? | Regex - matches a regular expression |
| !~ | page!~^/blog/.*? | Regex - does not match a regular expression |
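
To illustrate how the four operators read, here is a small sketch that parses one filter expression and evaluates it against a dictionary of properties. It is only a client-side illustration: the names parse_filter and matches are hypothetical, and it uses Python's re module, whose regex dialect may differ from the one the server applies.

```python
import re

# property name, then one of the four documented operators, then the value
_FILTER = re.compile(r"^([a-z_]+)(==|!=|~=|!~)(.*)$")


def parse_filter(expr: str) -> tuple[str, str, str]:
    """Split "page~=^/blog/.*?" into ("page", "~=", "^/blog/.*?")."""
    match = _FILTER.match(expr)
    if match is None:
        raise ValueError(f"invalid filter: {expr!r}")
    return match.group(1), match.group(2), match.group(3)


def matches(expr: str, event: dict) -> bool:
    """Evaluate a single filter expression against an event's properties."""
    prop, op, value = parse_filter(expr)
    actual = str(event.get(prop, ""))
    if op == "==":
        return actual == value
    if op == "!=":
        return actual != value
    found = re.search(value, actual) is not None
    return found if op == "~=" else not found


print(parse_filter("country!=Tanzania"))                      # ('country', '!=', 'Tanzania')
print(matches("page~=^/blog/.*?", {"page": "/blog/intro"}))   # True
```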

Endpoints

GET /api/v1/stats/realtime/visitors

This endpoint returns the number of current visitors on your site. A current visitor is defined as a visitor who triggered a pageview on your site in the last 5 minutes.

REQUEST

curl  "http://localhost:8080/api/v1/stats/realtime/visitors?site_id=$SITE_ID" \
  -H "Authorization: Bearer ${TOKEN}"

RESPONSE

{
  "visitors": "6"
}

Parameters


site_id REQUIRED

Domain of your site on vince.
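
The definition of a current visitor given for this endpoint (a pageview within the last 5 minutes) can be sketched as a count of distinct visitors in a time window. The function name current_visitors and the (visitor id, timestamp) input shape are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def current_visitors(pageviews: list[tuple[str, datetime]], now: datetime) -> int:
    """Count distinct visitor ids with a pageview in the 5 minutes before `now`."""
    cutoff = now - timedelta(minutes=5)
    return len({visitor for visitor, seen in pageviews if cutoff <= seen <= now})


now = datetime(2024, 3, 4, 12, 0, 0)
pageviews = [
    ("a", datetime(2024, 3, 4, 11, 58, 0)),  # 2 minutes ago: counted
    ("a", datetime(2024, 3, 4, 11, 59, 0)),  # same visitor: counted once
    ("b", datetime(2024, 3, 4, 11, 50, 0)),  # 10 minutes ago: not counted
]
print(current_visitors(pageviews, now))  # 1
```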


GET /api/v1/stats/aggregate

This endpoint aggregates metrics over a certain time period. It includes Unique Visitors, Pageviews, Bounce rate and Visit duration. You can retrieve any number and combination of these metrics in one request.

REQUEST

curl "http://localhost:8080/api/v1/stats/aggregate?site_id=$SITE_ID&period=6mo&metrics=visitors,pageviews,bounce_rate,visit_duration" \
  -H "Authorization: Bearer ${TOKEN}"

RESPONSE

{
  "results": {
    "bounce_rate": 1,
    "pageviews": 40,
    "visit_duration": 0.00834375,
    "visitors": 26
  }
}

Parameters


site_id REQUIRED

Domain of your site on vince.


period optional

See time periods. If not specified, it will default to 30d.


metrics optional

List of metrics to aggregate. Valid options are visitors, visits, pageviews, views_per_visit, bounce_rate, visit_duration and events. If not specified, it will default to visitors.


filters optional

See filtering


GET /api/v1/stats/timeseries

This endpoint provides timeseries data over a certain time period.

REQUEST

curl "http://localhost:8080/api/v1/stats/timeseries?site_id=$SITE_ID&period=6mo" \
  -H "Authorization: Bearer ${TOKEN}"

RESPONSE

{
  "results": [
    {
      "timestamp": "2024-03-04T00:00:00Z",
      "values": {
        "visitors": 26
      }
    }
  ]
}

Parameters


site_id REQUIRED

Domain of your site on vince.


period optional

See time periods. If not specified, it will default to 30d.


filters optional

See filtering


metrics optional

Comma-separated list of metrics to show for each time bucket. Valid options are visitors, visits, pageviews, views_per_visit, bounce_rate, visit_duration and events. If not specified, it will default to visitors.


interval optional

Choose your reporting interval. Valid options are date (always) and month (when specified period is longer than one calendar month). Defaults to month for 6mo and 12mo, otherwise falls back to date.
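
The default described for the interval parameter can be written out as a short rule. This sketch only restates the sentence above (month for 6mo and 12mo, date otherwise) and checks that an explicit value is one of the two valid options; the function name resolve_interval is hypothetical.

```python
from typing import Optional

VALID_INTERVALS = ("date", "month")


def resolve_interval(period: str, interval: Optional[str] = None) -> str:
    """Return the explicit interval, or the documented default for the period."""
    if interval is None:
        return "month" if period in ("6mo", "12mo") else "date"
    if interval not in VALID_INTERVALS:
        raise ValueError(f"interval must be one of {VALID_INTERVALS}")
    return interval


print(resolve_interval("6mo"))  # month
print(resolve_interval("30d"))  # date
```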

GET /api/v1/stats/breakdown

This endpoint allows you to break down your stats by some property. If you are familiar with SQL family databases, this endpoint corresponds to running GROUP BY on a certain property in your stats, then ordering by the count.

Check out the properties section for a reference of all the properties you can use in this query.

This endpoint can be used to fetch data for Top sources, Top pages, Top countries and similar reports.

REQUEST

curl "http://localhost:8080/api/v1/stats/breakdown?site_id=$SITE_ID&period=6mo&property=source&metrics=visitors,bounce_rate&limit=5" \
  -H "Authorization: Bearer ${TOKEN}"

RESPONSE

{
  "results": [
    {
      "property": "source",
      "values": [
        {
          "key": "Ask Toolbar",
          "value": {
            "bounce_rate": 0,
            "visitors": 1
          }
        },
        {
          "key": "Softonic",
          "value": {
            "bounce_rate": 1,
            "visitors": 2
          }
        },
        {
          "key": "Seznam",
          "value": {
            "bounce_rate": 1,
            "visitors": 1
          }
        },
        {
          "key": "Google News",
          "value": {
            "bounce_rate": 1,
            "visitors": 7
          }
        },
        {
          "key": "Google Blogsearch",
          "value": {
            "bounce_rate": 1,
            "visitors": 7
          }
        },
        {
          "key": "Tiscali",
          "value": {
            "bounce_rate": 1,
            "visitors": 2
          }
        },
        {
          "key": "Alexa",
          "value": {
            "bounce_rate": 1,
            "visitors": 2
          }
        },
        {
          "key": "Google",
          "value": {
            "bounce_rate": 1,
            "visitors": 8
          }
        }
      ]
    }
  ]
}

Parameters


site_id REQUIRED

Domain of your site on vince.


property REQUIRED

Which property to break down the stats by. Valid options are listed in the properties section above.


period optional

See time periods. If not specified, it will default to 30d.


metrics optional

Comma-separated list of metrics to show for each item in breakdown. Valid options are visitors, pageviews, bounce_rate, visit_duration, visits and events. If not specified, it will default to visitors.


limit optional

Limit the number of results. Maximum value is 1000. Defaults to 100. If you want to get more than 1000 results, you can make multiple requests and paginate the results by specifying the page parameter (e.g. make the same request with page=1, then page=2, etc.).


page optional

Number of the page, used to paginate results. Importantly, the page numbers start from 1 not 0.


filters optional

See filtering
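
The limit and page parameters above can be combined into a simple pagination loop. In this sketch the HTTP call is abstracted behind a caller-supplied fetch_page(page, limit) function, which is hypothetical, and the loop stops when a page comes back with fewer rows than limit, which is an assumption about how the last page looks rather than documented behaviour.

```python
from typing import Callable, Iterator


def iter_breakdown(fetch_page: Callable[[int, int], list],
                   limit: int = 1000) -> Iterator:
    """Yield every result by requesting page=1, page=2, ... until a short page."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    page = 1  # page numbers start from 1, not 0
    while True:
        rows = fetch_page(page, limit)
        yield from rows
        if len(rows) < limit:
            return
        page += 1


# Stand-in for the real endpoint: 2500 fake rows served in slices.
fake_rows = list(range(2500))


def fake_fetch(page: int, limit: int) -> list:
    return fake_rows[(page - 1) * limit: page * limit]


print(len(list(iter_breakdown(fake_fetch))))  # 2500
```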



Events API

The Vince Analytics Events API can be used to record pageviews and custom events. This is useful when tracking Android or iOS mobile apps, or for server side tracking.

In most cases we recommend integrating Vince through the provided script.

Unique visitor tracking

Special care should be taken with two key headers which are used for unique visitor counting:

  1. The User-Agent header
  2. The X-Forwarded-For header

If these headers are not sent exactly as required, unique visitor counting will not work as intended. Please refer to the Request headers section below for more in-depth documentation on each header separately.

Endpoints

POST /api/event

Records a pageview or custom event. When using this endpoint, it’s crucial to send the HTTP headers correctly, since these are used for unique user counting.

curl -i -X POST http://localhost:8080/api/event \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36 OPR/71.0.3770.284' \
  -H 'X-Forwarded-For: 127.0.0.1' \
  -H 'Content-Type: application/json' \
  --data '{"name":"pageview","url":"http://example.com","domain":"example.com"}'
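
For server-side tracking, the same request as the curl command above can be built with Python's standard library. This is a sketch under the assumption that the JSON field names match the curl example (name, url, domain, plus the optional referrer); the helper name event_request is hypothetical, and the function builds the request without sending it.

```python
import json
from typing import Optional
from urllib.request import Request


def event_request(base_url: str, name: str, url: str, domain: str,
                  user_agent: str, client_ip: Optional[str] = None,
                  referrer: Optional[str] = None) -> Request:
    """Build (but do not send) a POST /api/event request."""
    body = {"name": name, "url": url, "domain": domain}
    if referrer is not None:
        body["referrer"] = referrer
    headers = {"User-Agent": user_agent, "Content-Type": "application/json"}
    if client_ip is not None:
        # Only needed when sending on behalf of a visitor, e.g. from a backend.
        headers["X-Forwarded-For"] = client_ip
    return Request(f"{base_url}/api/event",
                   data=json.dumps(body).encode("utf-8"),
                   headers=headers, method="POST")


event = event_request("http://localhost:8080", "pageview", "http://example.com",
                      "example.com", user_agent="Mozilla/5.0",
                      client_ip="127.0.0.1")
print(event.get_method(), event.full_url)  # POST http://localhost:8080/api/event
```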

Parameters


domain REQUIRED

Domain name of the site in Vince


name REQUIRED

Name of the event. Specify pageview to record a pageview, which is a special type of event in Vince. All other names are treated as custom events.


url REQUIRED

URL of the page where the event was triggered. If the URL contains UTM parameters, they will be extracted and stored. When using the script, this is set to window.location.href.

The maximum size of the URL, excluding the domain and the query string, is 2,000 characters. Additionally, URLs using the data URI scheme are not supported by the API.


referrer OPTIONAL

Referrer for this event. When using the standard tracker script, this is set to document.referrer

Referrer values are processed heavily for better usability. Consider referrer URLs like m.facebook.com/some-path and facebook.com/some-other-path. It’s intuitive to think of both of these as coming from a single source: Facebook. In the first example the referrer value would be split into source == Facebook and referrer == m.facebook.com/some-path.

Vince uses the open source referer-parser database to parse referrers and assign these source categories.

Request headers

User-Agent REQUIRED

The raw value of User-Agent is used to calculate the user_id which identifies a unique visitor in Vince.

User-Agent is also used to populate the Devices properties in vince. The device data is derived from the open source database device-detector. If your User-Agent is not showing up in your dashboard, it’s probably because it is not recognized as one in the device-detector database.

The header is required but bear in mind that browsers and some HTTP libraries automatically add a default User-Agent header to HTTP requests. In case of browsers, we would not recommend overriding the header manually unless you have a specific reason to.


X-Forwarded-For optional

Used to explicitly set the IP address of the client. If not set, the remote IP of the sender will automatically be used. Depending on your use case:

  1. If sending the event from your visitors’ device, this header does not need to be set.
  2. If sending the event from a backend server or proxy, make sure to override this header with the correct IP address of the client.

The raw value of the IP address is not stored in our database. The IP address is used to calculate the user_id which identifies a unique visitor in Vince. It is also used to fill the Location properties with country, region and city data of the visitor.

If the header contains a comma-separated list (as it should if the request is sent through a chain of proxies), then the first valid IP address from the list is used. Both IPv4 and IPv6 addresses are supported. More information about the header format can be found on MDN docs.


Content-Type REQUIRED

Must be either application/json or text/plain. In case of text/plain, the request body is still interpreted as JSON.
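
The X-Forwarded-For rule described above (use the first valid IP address in a comma-separated list, otherwise fall back to the sender's remote IP) can be sketched with Python's ipaddress module. The function name client_ip is hypothetical and this is not vince's own parsing code.

```python
import ipaddress
from typing import Optional


def client_ip(x_forwarded_for: Optional[str], remote_ip: str) -> str:
    """Pick the first valid IP in X-Forwarded-For, else fall back to remote_ip."""
    for candidate in (x_forwarded_for or "").split(","):
        try:
            # Accepts both IPv4 and IPv6; raises ValueError for anything else.
            return str(ipaddress.ip_address(candidate.strip()))
        except ValueError:
            continue
    return remote_ip


print(client_ip("unknown, 203.0.113.7, 10.0.0.1", "198.51.100.2"))  # 203.0.113.7
print(client_ip(None, "198.51.100.2"))                              # 198.51.100.2
```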



All Metrics

CC-BY-SA-4.0: this section was initially copied from the Plausible Analytics docs.

Bounce Rate

The percentage of visitors with a single page view. A visitor “bounces” away and leaves your site after only viewing a single page.

Current Visitors

The number of people currently on your site. It includes all visitors who have loaded a page in the last 5 minutes.

Time on Page

The average time people spend on a particular page on your site. This is calculated as the difference between the point when a person lands on a particular page and when they move on to the next page.

Total Pageviews

The total number of times your pages were loaded by your visitors.

Unique Visitors

The number of people who visited your site. We are privacy-friendly so we don’t use cookies and other persistent identifiers. If a person visits from multiple devices or on multiple days, they are counted as separate visitors.

Views Per Visit

Views per visit (also known as Pages per session) shows the average number of pageviews per visit. Repeated views of a single page are included too.

Visit Duration

The amount of time visitors spend on your site. It only shows people who visit more than one page. For those who visit one page only we default to 0 seconds. Average visit duration is the sum of all session lengths divided by the number of sessions, which includes the 0 second visits (bounces).
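
The definitions of Bounce Rate, Views Per Visit and Visit Duration above reduce to simple arithmetic over sessions. The sketch below works through that arithmetic on a hypothetical Session record; it computes bounce rate per session, which is an assumption, and it is not how vince stores or computes these values internally.

```python
from dataclasses import dataclass


@dataclass
class Session:
    pageviews: int
    duration_seconds: float  # 0 for single-page visits (bounces)


def bounce_rate(sessions: list[Session]) -> float:
    """Percentage of sessions with a single pageview."""
    if not sessions:
        return 0.0
    bounces = sum(1 for s in sessions if s.pageviews == 1)
    return 100.0 * bounces / len(sessions)


def views_per_visit(sessions: list[Session]) -> float:
    """Total pageviews divided by the number of visits."""
    if not sessions:
        return 0.0
    return sum(s.pageviews for s in sessions) / len(sessions)


def visit_duration(sessions: list[Session]) -> float:
    """Sum of session lengths divided by all sessions, bounces included."""
    if not sessions:
        return 0.0
    return sum(s.duration_seconds for s in sessions) / len(sessions)


sessions = [Session(1, 0), Session(3, 120), Session(2, 60), Session(1, 0)]
print(bounce_rate(sessions), views_per_visit(sessions), visit_duration(sessions))
# 50.0 1.75 45.0
```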

Total Visits

A session (also known as a visit) is a set of actions that a user takes on your site. A visit is started when a visitor first lands on your page and ends when no action is taken on your site for 30 minutes.
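
The 30-minute inactivity rule above can be sketched as a grouping of one visitor's event times into visits. The function name split_visits is hypothetical; treating a gap of exactly 30 minutes as the start of a new visit is an assumption about the boundary case.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def split_visits(timestamps: list[datetime]) -> list[list[datetime]]:
    """Group one visitor's event times into visits separated by 30+ idle minutes."""
    visits: list[list[datetime]] = []
    for ts in sorted(timestamps):
        # Continue the current visit only if the previous action was recent.
        if visits and ts - visits[-1][-1] < SESSION_TIMEOUT:
            visits[-1].append(ts)
        else:
            visits.append([ts])
    return visits


events = [
    datetime(2024, 3, 4, 9, 0),
    datetime(2024, 3, 4, 9, 10),
    datetime(2024, 3, 4, 10, 0),  # 50 idle minutes: a new visit starts
]
print(len(split_visits(events)))  # 2
```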