vince
API-first, high-performance, self-hosted, cost-effective and privacy-friendly web analytics server for organizations of any size
What is vince?
Vince is a modern server for collecting and analyzing website analytics. Vince focuses on modern web application development by emphasizing ease of use in deployment, maintenance and integration with existing infrastructure.
It ships as a standalone binary with zero dependencies, called vince.
Features
Extremely fast relative to competitors: Uses Apache Arrow for fast, vectorized in-memory computation. It is designed from the ground up and highly optimized for the web analytics use case.
Zero dependency: Ships a single binary with everything in it. No runtime dependency.
High event ingestion rate: Non-blocking ingestion; you can deploy for very popular sites without worrying.
Fast query API: Instant results for active and historical data.
Easy to operate: A one-line command with flags and environment variables is all you need.
Works with any language and tooling: No need for a special SDK; a simple HTTP API is exposed. Anything that can speak HTTP can work with vince.
10X more data storage: We use columnar storage with extensive compression schemes. Don’t worry about running out of disk. Store and query large volumes of data.
Unlimited sites: There is no limit on how many sites you can manage.
Privacy friendly: No cookies and fully compliant with GDPR, CCPA and PECR.
Getting started
Installation
Installation script
curl -fsSL https://vinceanalytics.com/install.sh | bash
Homebrew
brew install vinceanalytics/tap/vince
Container image
docker pull ghcr.io/vinceanalytics/vince
Starting server
vince --data=vince-data --domains=example.com --nodeId=1
This will start the vince server listening on port 8080.
Check if your server is up and running
$ curl http://localhost:8080/version
{
"version": "v0.0.62"
}
Add script to your website
To integrate your website with Vince Analytics, you need to be able to update the HTML code of the website you want to track. Paste your Vince Analytics tracking script code into the header section of your site, within the <head> … </head> tags.
Your Vince Analytics tracking script code will look something like this.
Note:
- data-api is the URL where the vince instance is listening and accepting events; the events path is /api/event
- data-domain is the site_id/domain name you want to monitor for events
Development script on localhost
<script data-domain="example.com" data-api="http://localhost:8080/api/event">
(function(){"use strict";var t,n,o,a,l,r=window.location,e=window.document,c=e.currentScript,h=c.getAttribute("data-api")||d(c);function u(e){console.warn("Ignoring Event: "+e)}function d(e){return new URL(e.src).origin+"/api/event"}function i(t,n){try{if(window.localStorage.vince_ignore==="true")return u("localStorage flag")}catch{}var o,s={};s.n=t,s.u=r.href,s.d=c.getAttribute("data-domain"),s.r=e.referrer||null,s.w=window.innerWidth,n&&n.meta&&(s.m=JSON.stringify(n.meta)),n&&n.props&&(s.p=n.props),o=new XMLHttpRequest,o.open("POST",h,!0),o.setRequestHeader("Content-Type","text/plain"),o.send(JSON.stringify(s)),o.onreadystatechange=function(){o.readyState===4&&n&&n.callback&&n.callback()}}a=window.vince&&window.vince.q||[],window.vince=i;for(t=0;t<a.length;t++)i.apply(this,a[t]);function s(){if(o===r.pathname)return;o=r.pathname,i("pageview")}n=window.history,n.pushState&&(l=n.pushState,n.pushState=function(){l.apply(this,arguments),s()},window.addEventListener("popstate",s));function m(){!o&&e.visibilityState==="visible"&&s()}e.visibilityState==="prerender"?e.addEventListener("visibilitychange",m):s()})()
</script>
Production script
<script data-domain="vinceanalytics.com" data-api="http://api.vinceanalytics.com/api/event">
(function(){"use strict";var n,s,i,r,d,t=window.location,e=window.document,c=e.currentScript,h=c.getAttribute("data-api")||u(c);function l(e){console.warn("Ignoring Event: "+e)}function u(e){return new URL(e.src).origin+"/api/event"}function a(n,s){if(/^localhost$|^127(\.[0-9]+){0,2}\.[0-9]+$|^\[::1?\]$/.test(t.hostname)||t.protocol==="file:")return l("localhost");if(window._phantom||window.__nightmare||window.navigator.webdriver||window.Cypress)return;try{if(window.localStorage.vince_ignore==="true")return l("localStorage flag")}catch{}var i,o={};o.n=n,o.u=t.href,o.d=c.getAttribute("data-domain"),o.r=e.referrer||null,o.w=window.innerWidth,s&&s.meta&&(o.m=JSON.stringify(s.meta)),s&&s.props&&(o.p=s.props),i=new XMLHttpRequest,i.open("POST",h,!0),i.setRequestHeader("Content-Type","text/plain"),i.send(JSON.stringify(o)),i.onreadystatechange=function(){i.readyState===4&&s&&s.callback&&s.callback()}}r=window.vince&&window.vince.q||[],window.vince=a;for(n=0;n<r.length;n++)a.apply(this,r[n]);function o(){if(i===t.pathname)return;i=t.pathname,a("pageview")}s=window.history,s.pushState&&(d=s.pushState,s.pushState=function(){d.apply(this,arguments),o()},window.addEventListener("popstate",o));function m(){!i&&e.visibilityState==="visible"&&o()}e.visibilityState==="prerender"?e.addEventListener("visibilitychange",m):o()})()
</script>
After adding the script no further configuration on the website is needed. When a user visits the website events will be sent to your Vince Analytics instance.
Configuration
vince has three ways of passing configuration: commandline flags, environment variables and a configuration file.
All three ways can be combined to form a secure deployment. Sources are applied in the order cli -> env -> file, and a later source overrides an earlier one. So if, say, listen is provided by all three, the value set in the file will be used.
We recommend using commandline flags and environment variables. Anything that can be expressed in a configuration file can also be expressed with commandline flags and environment variables.
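The precedence rule can be pictured as a layered merge. The sketch below is only an illustration in Python, not vince's actual implementation; the function name `merge_config` and the sample values are hypothetical.

```python
def merge_config(cli: dict, env: dict, file: dict) -> dict:
    """Merge configuration sources in the order cli -> env -> file.

    Later sources override earlier ones, so a key set in the file wins.
    """
    merged: dict = {}
    for source in (cli, env, file):
        merged.update(source)
    return merged


# Hypothetical values, for illustration only.
cli = {"listen": ":8080", "data": "vince-data"}
env = {"listen": ":9090"}
file = {"listen": ":7070"}

config = merge_config(cli, env, file)
print(config["listen"])  # the value from the file
print(config["data"])    # only set on the command line, so it is kept
```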
Data
Path to the directory where vince will store persistent data. This option is required for vince to be operational.
- env VINCE_DATA, example VINCE_DATA=path/to/storage
- flag --data, example --data=path/to/storage
- file data, example {"data":"path/to/storage"}
Listen
HTTP host:port on which the vince server will listen for HTTP requests.
- env VINCE_LISTEN, example VINCE_LISTEN=:8080
- flag --listen, example --listen=:8080
- file listen, example {"listen":":8080"}
Rate Limit
A float representing requests/second. This is applied on /api/event and /api/v1/event to protect them against excessive incoming requests.
By default there is no limit.
Limits are applied globally across all sites. We don’t support per-site limits yet, but they are planned for future releases.
- env VINCE_RATE_LIMIT, example VINCE_RATE_LIMIT=1.7976931348623157e+308
- flag --rateLimit, example --rateLimit=1.7976931348623157e+308
- file rateLimit, example {"rateLimit":1.7976931348623157e+308}
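The example value above is the largest finite 64-bit float, which would be consistent with the statement that there is no limit by default. A quick check in Python (the name `NO_LIMIT` is only for illustration):

```python
import sys

# 1.7976931348623157e+308 is the largest finite IEEE 754 double.
NO_LIMIT = 1.7976931348623157e+308

print(NO_LIMIT == sys.float_info.max)
```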
Granule Size
Size in bytes of index + parts at which they are compacted and stored on disk.
By default 16MB is used. You may need to adjust this depending on your deployment resources.
Otherwise, 16MB is a very conservative balance. This option is for power users. Production deployments can stick with the default.
When the in-memory index + arrow.Record reaches this size, we convert the arrow record to a parquet file with compression enabled, then the resulting file is stored in a durable key/value store with the configured retention period.
So, while 16MB is held in memory, what actually goes to disk is smaller than this value. We also compress the index before storing it.
- env VINCE_GRANULE_SIZE, example VINCE_GRANULE_SIZE=16777216
- flag --granuleSize, example --granuleSize=16777216
- file granuleSize, example {"granuleSize":"16777216"}
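The example value 16777216 is 16MB expressed in bytes using binary units (1MB = 1024 * 1024 bytes). A small Python sketch of the arithmetic, with hypothetical variable names:

```python
MIB = 1024 * 1024          # bytes in one binary megabyte
granule_size = 16 * MIB    # the documented default, in bytes

print(granule_size)  # 16777216
```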
Geo IP DB
Path to the geo IP database. We use this to obtain city, country and region information of an event.
This is optional.
- env VINCE_GEOIP_DB, example VINCE_GEOIP_DB=path/to/geoip_db
- flag --geoipDbPath, example --geoipDbPath=path/to/geoip_db
- file geoipDbPath, example {"geoipDbPath":"path/to/geoip_db"}
Domains
A list of domains managed by the vince instance. To send events to vince there is no extra configuration on the client side.
Events submitted for a domain that is not registered are rejected. If you no longer manage a site, simply removing it from this list will stop vince accepting events for it.
Domain is the hostname. For example https://vinceanalytics.com has domain vinceanalytics.com
- env VINCE_DOMAINS, example VINCE_DOMAINS=vinceanalytics.com,example.com
- flag --domains, example --domains=vinceanalytics.com,example.com
- file domains, example {"domains":["vinceanalytics.com","example.com"]}
Configuration file path
Path to the configuration file. When provided, it will be read and merged with the rest of the configuration. Values set in this file take precedence.
- env VINCE_CONFIG, example VINCE_CONFIG=path/to/config.json
- flag --config, example --config=path/to/config.json
Log level
How much will be logged on stdout.
Default is INFO. Values are INFO, DEBUG, WARN and ERROR.
- env VINCE_LOG_LEVEL, example VINCE_LOG_LEVEL=INFO
- flag --logLevel, example --logLevel=INFO
Retention Period
How long data will stay in permanent storage. Older data will automatically be deleted and the space reclaimed.
Default is 30 days.
- env VINCE_RETENTION_PERIOD, example VINCE_RETENTION_PERIOD=720h0m0s
- flag --retentionPeriod, example --retentionPeriod=720h0m0s
- file retentionPeriod, example {"retentionPeriod":"2592000s"}
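Both example values, 720h0m0s and 2592000s, describe the same 30 day default. A short Python sketch of the conversion (variable names are hypothetical):

```python
from datetime import timedelta

retention = timedelta(days=30)
hours = int(retention.total_seconds() // 3600)
seconds = int(retention.total_seconds())

print(f"{hours}h0m0s")  # 720h0m0s
print(f"{seconds}s")    # 2592000s
```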
Automatic TLS
vince supports automatic TLS using an ACME client with Let’s Encrypt.
Enabling automatic tls
- env VINCE_AUTO_TLS, example VINCE_AUTO_TLS=true
- flag --autoTls, example --autoTls=true
- file autoTls, example {"autoTls":"true"}
You need to set up the account email address and the domain to generate a certificate for, using the acmeEmail and acmeDomain options.
acmeEmail
This is used by CAs, such as Let’s Encrypt, to notify about problems with issued certificates.
- env VINCE_ACME_EMAIL, example VINCE_ACME_EMAIL=example@example.org
- flag --acmeEmail, example --acmeEmail=example@example.org
acmeDomain
acmeDomain should be the domain name that points to your server. For example, we host a vince instance on cloud.vinceanalytics.com, so we use this as acmeDomain.
- env VINCE_ACME_DOMAIN, example VINCE_ACME_DOMAIN=example.org
- flag --acmeDomain, example --acmeDomain=example.org
- file {"acme":{"email":"example@example.org","domain":"example.org"}}
Authorization
vince supports bearer token authorization via the authToken option. All endpoints except /api/event will be protected.
authToken
When set, client calls without this bearer token will be rejected.
This is sensitive information; use an environment variable to set it.
- env VINCE_AUTH_TOKEN, example VINCE_AUTH_TOKEN=xxx
- flag --authToken, example --authToken=xxx
Stats API
CC-BY-SA-4.0 This section was initially copied from Plausible Analytics docs
The vince API offers a way to retrieve your stats programmatically. It’s a read-only interface to present historical and real-time stats only. Take a look at our events API if you want to send pageviews or custom events.
The API accepts GET requests with query parameters and returns standard HTTP responses along with a JSON-encoded body. All API requests must be made over HTTPS. Calls made over plain HTTP will fail. API requests without authentication will also fail.
Each request must be authenticated with an authToken using the Bearer Token method.
Concepts
Querying the vince API will feel familiar if you have used time-series databases before. You can’t query individual records from our stats database. You can only request aggregated metrics over a certain time period.
Each request requires a site_id parameter which is the domain of your site as configured in domains.
Metrics
You can specify a metrics option in the query to choose the metrics for each instance returned. See the All Metrics section for a full overview of metrics and their definitions. The metrics currently supported in the Stats API are:
Metric | Description |
---|---|
visitors | The number of unique visitors. |
visits | The number of visits/sessions |
pageviews | The number of pageview events |
views_per_visit | The number of pageviews divided by the number of visits. Returns a floating point number. Currently only supported in Aggregate and Timeseries endpoints. |
bounce_rate | Bounce rate percentage |
visit_duration | Visit duration in seconds |
events | The number of events (pageviews + custom events) |
Time periods
The options are identical for each endpoint that supports configurable time periods. Each period is relative to a date parameter. The date should follow the standard ISO-8601 format. When not specified, the date field defaults to today(site.timezone).
All time calculations on our backend are done in the time zone that the site is configured in.
- 12mo, 6mo: Last n calendar months relative to date.
- month: The calendar month that date falls into.
- 30d, 7d: Last n days relative to date.
- day: Stats for the full day specified in date.
- custom: Provide a custom range in the date parameter.
When using a custom range, the date parameter expects two ISO-8601 formatted dates joined with a comma as follows ?period=custom&date=2021-01-01,2021-01-31.
Stats will be returned for the whole date range inclusive of the start and end dates.
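A custom range can be assembled from two dates as sketched below in Python. The helper name `custom_period` is hypothetical and only illustrates the query format described above.

```python
from datetime import date
from urllib.parse import urlencode


def custom_period(start: date, end: date) -> dict:
    """Return query parameters for an inclusive custom date range."""
    if end < start:
        raise ValueError("end must not be before start")
    return {"period": "custom", "date": f"{start.isoformat()},{end.isoformat()}"}


params = custom_period(date(2021, 1, 1), date(2021, 1, 31))
# safe="," keeps the comma readable instead of encoding it as %2C
query = urlencode(params, safe=",")
print(query)  # period=custom&date=2021-01-01,2021-01-31
```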
Properties
Each pageview and custom event in our database has some predefined properties associated with it. In other analytics tools, these are often referred to as dimensions as well. Properties can be used for filtering and breaking down your stats to drill into more depth. Here’s the full list of properties we collect automatically:
Property | Example | Description |
---|---|---|
page | /blog/remove-google-analytics | Pathname of the page where the event is triggered. You can also use an asterisk to group multiple pages (/blog*) |
entry_page | /home | Page on which the visit session started (landing page). |
exit_page | /home | Page on which the visit session ended (last page viewed). |
source | | Visit source, populated from a URL query parameter tag (utm_source, source or ref) or the Referer HTTP header. |
referrer | t.co/fzWTE9OTPt | Raw Referer header without http://, https:// or www. |
utm_medium | social | Raw value of the utm_medium query param on the entry page. |
utm_source | | Raw value of the utm_source query param on the entry page. |
utm_campaign | profile | Raw value of the utm_campaign query param on the entry page. |
utm_content | banner | Raw value of the utm_content query param on the entry page. |
utm_term | keyword | Raw value of the utm_term query param on the entry page. |
device | Desktop | Device type. Possible values are Desktop, Laptop, Tablet and Mobile. |
browser | Chrome | Name of the browser vendor. Most popular ones are Chrome, Safari and Firefox. |
browser_version | 88.0.4324.146 | Version number of the browser used by the visitor. |
os | Mac | Name of the operating system. Most popular ones are Mac, Windows, iOS and Android. Linux distributions are reported separately. |
os_version | 10.6 | Version number of the operating system used by the visitor. |
country | United Kingdom | Country of the visitor. |
region | England | Region of the visitor. |
city | London | City of the visitor. |
Filtering
Most endpoints support a filters query parameter to drill down into your data. You can filter by all properties described in the Properties table, using the following operators:
Operator | Usage example | Explanation |
---|---|---|
== | name==Signup | Simple equality - custom event “Signup” |
!= | country!=Tanzania | Simple inequality - country is not Tanzania |
~= | page~=^/blog/.*? | Regex - matches a regular expression |
!~ | page!~^/blog/.*? | Negated regex - does not match a regular expression |
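Filter expressions contain characters such as = and ? that need URL encoding when placed in a query string. The Python sketch below builds a single filter expression from the operators in the table; the helper `filter_expr` is hypothetical, and how several filters are combined is not covered here.

```python
from urllib.parse import quote, unquote

# Operators listed in the table above.
OPERATORS = ("==", "!=", "~=", "!~")


def filter_expr(prop: str, op: str, value: str) -> str:
    """Build one filter expression such as page~=^/blog/.*?"""
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    return f"{prop}{op}{value}"


expr = filter_expr("page", "~=", "^/blog/.*?")
encoded = quote(expr, safe="")  # percent-encode for use as a query value

print(expr)
print(unquote(encoded) == expr)  # decoding gives the expression back
```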
Endpoints
GET /api/v1/stats/realtime/visitors
This endpoint returns the number of current visitors on your site. A current visitor is defined as a visitor who triggered a pageview on your site in the last 5 minutes.
REQUEST
curl "http://localhost:8080/api/v1/stats/realtime/visitors?site_id=$SITE_ID" \
-H "Authorization: Bearer ${TOKEN}"
RESPONSE
{
"visitors": "6"
}
Parameters
site_id (REQUIRED): Domain of your site on vince.
GET /api/v1/stats/aggregate
This endpoint aggregates metrics over a certain time period. These include Unique Visitors, Pageviews, Bounce rate and Visit duration. You can retrieve any number and combination of these metrics in one request.
REQUEST
curl "http://localhost:8080/api/v1/stats/aggregate?site_id=$SITE_ID&period=6mo&metrics=visitors,pageviews,bounce_rate,visit_duration" \
-H "Authorization: Bearer ${TOKEN}"
RESPONSE
{
"results": {
"bounce_rate": 1,
"pageviews": 40,
"visit_duration": 0.00834375,
"visitors": 26
}
}
Parameters
site_id (REQUIRED): Domain of your site on vince.
period (optional): See time periods. If not specified, it will default to 30d.
metrics (optional): List of metrics to aggregate. Valid options are visitors, visits, pageviews, views_per_visit, bounce_rate, visit_duration and events. If not specified, it will default to visitors.
filters (optional): See filtering.
GET /api/v1/stats/timeseries
This endpoint provides timeseries data over a certain time period.
REQUEST
curl "http://localhost:8080/api/v1/stats/timeseries?site_id=$SITE_ID&period=6mo" \
-H "Authorization: Bearer ${TOKEN}"
RESPONSE
{
"results": [
{
"timestamp": "2024-03-04T00:00:00Z",
"values": {
"visitors": 26
}
}
]
}
Parameters
site_id (REQUIRED): Domain of your site on vince.
period (optional): See time periods. If not specified, it will default to 30d.
filters (optional): See filtering.
metrics (optional): Comma-separated list of metrics to show for each time bucket. Valid options are visitors, visits, pageviews, views_per_visit, bounce_rate, visit_duration and events. If not specified, it will default to visitors.
interval (optional): Choose your reporting interval. Valid options are date (always) and month (when specified period is longer than one calendar month). Defaults to month for 6mo and 12mo, otherwise falls back to date.
GET /api/v1/stats/breakdown
This endpoint allows you to break down your stats by some property. If you are familiar with SQL family databases, this endpoint corresponds to running GROUP BY on a certain property in your stats, then ordering by the count.
Check out the properties section for a reference of all the properties you can use in this query.
This endpoint can be used to fetch data for Top sources, Top pages, Top countries and similar reports.
REQUEST
curl "http://localhost:8080/api/v1/stats/breakdown?site_id=$SITE_ID&period=6mo&property=source&metrics=visitors,bounce_rate&limit=5" \
-H "Authorization: Bearer ${TOKEN}"
RESPONSE
{
"results": [
{
"property": "source",
"values": [
{
"key": "Ask Toolbar",
"value": {
"bounce_rate": 0,
"visitors": 1
}
},
{
"key": "Softonic",
"value": {
"bounce_rate": 1,
"visitors": 2
}
},
{
"key": "Seznam",
"value": {
"bounce_rate": 1,
"visitors": 1
}
},
{
"key": "Google News",
"value": {
"bounce_rate": 1,
"visitors": 7
}
},
{
"key": "Google Blogsearch",
"value": {
"bounce_rate": 1,
"visitors": 7
}
},
{
"key": "Tiscali",
"value": {
"bounce_rate": 1,
"visitors": 2
}
},
{
"key": "Alexa",
"value": {
"bounce_rate": 1,
"visitors": 2
}
},
{
"key": "Google",
"value": {
"bounce_rate": 1,
"visitors": 8
}
}
]
}
]
}
Parameters
site_id (REQUIRED): Domain of your site on vince.
property (REQUIRED): Which property to break down the stats by. Valid options are listed in the properties section above.
period (optional): See time periods. If not specified, it will default to 30d.
metrics (optional): Comma-separated list of metrics to show for each item in breakdown. Valid options are visitors, pageviews, bounce_rate, visit_duration, visits and events. If not specified, it will default to visitors.
limit (optional): Limit the number of results. Maximum value is 1000. Defaults to 100. If you want to get more than 1000 results, you can make multiple requests and paginate the results by specifying the page parameter (e.g. make the same request with page=1, then page=2, etc).
page (optional): Number of the page, used to paginate results. Importantly, the page numbers start from 1 not 0.
filters (optional): See filtering.
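The curl requests in this section can also be issued from code. The following Python sketch builds an authenticated Stats API request using only the standard library; the helper names `stats_request` and `fetch` are hypothetical, and the token and site values are placeholders.

```python
import json
import urllib.request
from urllib.parse import urlencode


def stats_request(base_url: str, endpoint: str, token: str, **params: str) -> urllib.request.Request:
    """Build a GET request for /api/v1/stats/<endpoint> with a bearer token."""
    query = urlencode(params, safe=",")  # keep comma-separated metrics readable
    url = f"{base_url}/api/v1/stats/{endpoint}?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


req = stats_request(
    "http://localhost:8080", "aggregate", "xxx",
    site_id="example.com",
    period="6mo",
    metrics="visitors,pageviews,bounce_rate,visit_duration",
)
print(req.full_url)
# fetch(req) would send it to a vince instance listening on localhost:8080.
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without a running server.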
Events API
The Vince Analytics Events API can be used to record pageviews and custom events. This is useful when tracking Android or iOS mobile apps, or for server side tracking.
In most cases we recommend installing Vince through the provided script.
Unique visitor tracking
Special care should be taken with two key headers which are used for unique visitor counting:
- The User-Agent header
- The X-Forwarded-For header
If these headers are not sent exactly as required, unique visitor counting will not work as intended. Please refer to the Request headers section below for more in-depth documentation on each header separately.
Endpoints
POST /api/event
Records a pageview or custom event. When using this endpoint, it’s crucial to send the HTTP headers correctly, since these are used for unique user counting.
curl -i -X POST http://localhost:8080/api/event \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36 OPR/71.0.3770.284' \
-H 'X-Forwarded-For: 127.0.0.1' \
-H 'Content-Type: application/json' \
--data '{"name":"pageview","url":"http://example.com","domain":"example.com"}'
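For server side tracking, the curl call above could be mirrored in Python roughly as follows. This is a sketch using only the standard library; the helper name `event_request` and the sample User-Agent are hypothetical, and the request is built but not sent.

```python
import json
import urllib.request
from typing import Optional


def event_request(api_url: str, name: str, url: str, domain: str,
                  user_agent: str, client_ip: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request for /api/event with the headers described above."""
    body = json.dumps({"name": name, "url": url, "domain": domain}).encode("utf-8")
    headers = {"User-Agent": user_agent, "Content-Type": "application/json"}
    if client_ip is not None:
        # Only needed when sending from a backend server or proxy.
        headers["X-Forwarded-For"] = client_ip
    return urllib.request.Request(api_url, data=body, headers=headers, method="POST")


req = event_request(
    "http://localhost:8080/api/event",
    name="pageview",
    url="http://example.com",
    domain="example.com",
    user_agent="Mozilla/5.0 (X11; Linux x86_64)",
    client_ip="127.0.0.1",
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it to a running vince instance.
```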
Parameters
domain (REQUIRED): Domain name of the site in Vince.
name (REQUIRED): Name of the event. Can specify pageview which is a special type of event in Vince. All other names will be treated as custom events.
url (REQUIRED): URL of the page where the event was triggered. If the URL contains UTM parameters, they will be extracted and stored. When using the script, this is set to window.location.href. The maximum size of the URL, excluding the domain and the query string, is 2,000 characters. Additionally, URLs using the data URI scheme are not supported by the API.
referrer (OPTIONAL): Referrer for this event. When using the standard tracker script, this is set to document.referrer. Referrer values are processed heavily for better usability. Consider referrer URLs like m.facebook.com/some-path and facebook.com/some-other-path. It’s intuitive to think of both of these as coming from a single source: Facebook. In the first example the referrer value would be split into source == Facebook and referrer == m.facebook.com/some-path. Vince uses the open source referer-parser database to parse referrers and assign these source categories.
Request headers
User-Agent (REQUIRED): The raw value of User-Agent is used to calculate the user_id which identifies a unique visitor in Vince. User-Agent is also used to populate the Devices properties in vince. The device data is derived from the open source database device-detector. If your User-Agent is not showing up in your dashboard, it’s probably because it is not recognized as one in the device-detector database. The header is required but bear in mind that browsers and some HTTP libraries automatically add a default User-Agent header to HTTP requests. In case of browsers, we would not recommend overriding the header manually unless you have a specific reason to.
X-Forwarded-For (optional): Used to explicitly set the IP address of the client. If not set, the remote IP of the sender will automatically be used. Depending on your use-case:
1. If sending the event from your visitors’ device, this header does not need to be set
2. If sending the event from a backend server or proxy, make sure to override this header with the correct IP address of the client.
The raw value of the IP address is not stored in our database. The IP address is used to calculate the user_id which identifies a unique visitor in Vince. It is also used to fill the Location properties with country, region and city data of the visitor. If the header contains a comma-separated list (as it should if the request is sent through a chain of proxies), then the first valid IP address from the list is used. Both IPv4 and IPv6 addresses are supported. More information about the header format can be found on MDN docs.
Content-Type (REQUIRED): Must be either application/json or text/plain. In case of text/plain, the request body is still interpreted as JSON.
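The X-Forwarded-For rule described above (use the first valid IP address from a comma-separated list) can be sketched in Python as follows. The function name `first_valid_ip` is hypothetical, and vince's actual parsing may differ in detail.

```python
import ipaddress
from typing import Optional


def first_valid_ip(header_value: str) -> Optional[str]:
    """Return the first entry that parses as an IPv4 or IPv6 address."""
    for part in header_value.split(","):
        candidate = part.strip()
        try:
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # not an IP address, try the next entry
    return None


print(first_valid_ip("203.0.113.7, 10.0.0.1"))  # 203.0.113.7
print(first_valid_ip("unknown, 2001:db8::1"))   # 2001:db8::1
```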
All Metrics
CC-BY-SA-4.0 This section was initially copied from Plausible Analytics docs
Bounce Rate
The percentage of visitors with a single page view. A visitor “bounces” away and leaves your site after only viewing a single page.
Current Visitors
The number of people currently on your site. It includes all visitors who have loaded a page in the last 5 minutes.
Time on Page
The average time people spend on a particular page on your site. This is calculated as the difference between the point when a person lands on a particular page and when they move on to the next page.
Total Pageviews
The total number of times your pages were loaded by your visitors.
Unique Visitors
The number of people who visited your site. We are privacy-friendly so we don’t use cookies and other persistent identifiers. If a person visits from multiple devices or on multiple days, they are counted as separate visitors.
Views Per Visit
Views per visit (also known as Pages per session) shows the average number of pageviews per visit. Repeated views of a single page are included too.
Visit Duration
The amount of time visitors spend on your site. It only shows people who visit more than one page. For those who visit one page only we default to 0 seconds. Average visit duration is the sum of all session lengths divided by the number of sessions, which includes the 0 second visits (bounces).
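As a worked example of the average described above, the Python sketch below includes bounces as 0 second sessions. The function name and the sample durations are hypothetical.

```python
def average_visit_duration(session_lengths: list) -> float:
    """Sum of all session lengths divided by the number of sessions."""
    if not session_lengths:
        return 0.0
    return sum(session_lengths) / len(session_lengths)


# Two multi-page visits (120s and 60s) and two bounces (0s each).
sessions = [120, 0, 0, 60]
print(average_visit_duration(sessions))  # 45.0
```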
Total Visits
A session (also known as a visit) is a set of actions that a user takes on your site. A visit is started when a visitor first lands on your page and ends when no action is taken on your site for 30 minutes.
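The 30 minute rule can be illustrated by grouping one visitor's event timestamps into visits. This Python sketch is an illustration of the definition above, not vince's implementation; treating a gap of exactly 30 minutes as the same visit is an assumption.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def count_visits(timestamps: list) -> int:
    """Count visits for one visitor: a gap above the timeout starts a new visit."""
    visits = 0
    last = None
    for ts in sorted(timestamps):
        if last is None or ts - last > SESSION_TIMEOUT:
            visits += 1
        last = ts
    return visits


events = [
    datetime(2024, 3, 4, 9, 0),
    datetime(2024, 3, 4, 9, 10),
    datetime(2024, 3, 4, 9, 50),  # 40 minutes after the previous event
    datetime(2024, 3, 4, 10, 5),
]
print(count_visits(events))  # 2
```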