Primer on Oracle Reputation in the Chainlink Network
A breakdown of our thoughts on oracle reputation on the Chainlink network and the considerations we have been weighing in the development of our soon-to-be-released Relative Performance Ranking model whitepaper. - 2021-08-06
The reputation.link team has spent quite some time thinking about reputation on the Chainlink Network. With the launch of our platform rebuild, we have decided to share our current thoughts with the intention of getting more community stakeholders to consider the problem and an implementation of a reputation solution.
Oracle reputation is a critical component of the Chainlink ecosystem going forward. As more users join the network and the Chainlink protocol takes less and less time to integrate, there needs to be a way to quickly determine which oracle networks or individual oracles are the most reliable and cost-effective for certain smart contract agreements. In the post below we dissect some of the ideas that could shape Chainlink reputation, as well as some of the hurdles that may arise as development progresses.
Table of contents
- The Ideal Oracle
- On-Chain Reputation Metrics
- Attacking Reputation
- Answer Accuracy
- Off-chain Analysis for Reputation
- Reputation and Off-chain Aggregation
- Third Party Risk
- Specialisation of Oracles and Reputation
- Fraud Detection
- Data Provenance and Reputation
- Ending Note
The Ideal Oracle
An oracle’s reputation should represent the objective quality of its observed performance. One way of standardising reputation is in the form of a reputation score. How a score represents an oracle’s performance is to some degree subjective and will therefore never be entirely correct, but its intent is to give the closest possible estimate of an oracle’s quality. To formulate what could make up a reputation score, the qualities of an ideal oracle are laid out and discussed below. The ideal oracle is:
- First to respond to requests
- Extremely fast to respond (within 1 block)
- Completing 100% of jobs
- Always guaranteeing service agreements with appropriate collateral
- Incurring zero penalty payments
- Running at 99.99% uptime
- Relaying the objective truth always
To achieve this uptime and response time, the ideal oracle should implement a failover protocol that kicks in if any critical component crashes or goes down, ensuring that the oracle remains online and responsive. For network robustness and decentralisation, the computational components and other hardware requirements should ideally be hosted in a local, private environment. This not only promotes decentralisation of oracle networks but also removes any reliance on third party cloud providers that have absolute power when it comes to controlling infrastructure.
The ideal oracle should be honest in its answers by subscribing to API providers or, more ideally, by having access to the primary data source. By being connected to the change of state in reality, whether through an IoT device or through licensing rights to host a database that receives primary data for a particular data feed, an oracle is as close to the source as possible. An oracle should be transparent about how it sources its data and be able to verify and provide proof of its source and of any aggregation/computation done on that data.
The oracle should act as a trustless agent that prioritises the correct and intended execution of a smart contract agreed upon by the parties involved. It should be impervious to bribes or rent-seeking that may change the data provided and the contract executed. In other words, the ideal oracle has no incentive other than to tell the truth. If one entity commands more than one oracle, it should disclose that information so that the job requester is aware of the chances of a Sybil attack undermining a service agreement.
On-Chain Reputation Metrics
Only fully verifiable data should be used to determine any form of reputation score.
Staking data is the most intuitive way to grade the reputation of an oracle, through the amount of value staked on a Chainlink node and any loss of that stake. However, it is also important to consider other on-chain metrics that can contribute to a reputation score. Below we discuss:
- Response Time
- Jobs Completed
- Staking Performance
Response Time
- Gas Price Used: 100 Gwei
- Duration between request and response: 1 Block, 13 Seconds
- Average percentile of response: Top 30%
When considering how the response time of an oracle should affect a reputation score, it is assumed that the oracle that responds first should be considered the most performant oracle in that instance. With this logic, response time can be measured and recorded by determining how many blocks each oracle takes to respond and then ordering oracles by the gas price they used.
Consider the example where two oracles submit their responses on-chain at the exact same time. If both oracles’ answers are placed in the same block of transactions, it should be the oracle that has used the highest gas price that earns the most reputation for that transaction.
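The ordering described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Response` record, its field names and the sample values are hypothetical, and a real implementation would read block numbers and gas prices from on-chain transaction data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    oracle: str            # hypothetical oracle identifier
    block_number: int      # block in which the response transaction was mined
    gas_price_gwei: float  # gas price paid for that transaction


def rank_responses(responses):
    """Earliest block first; within the same block, highest gas price first."""
    return sorted(responses, key=lambda r: (r.block_number, -r.gas_price_gwei))


responses = [
    Response("oracle_a", block_number=101, gas_price_gwei=80.0),
    Response("oracle_b", block_number=100, gas_price_gwei=90.0),
    Response("oracle_c", block_number=100, gas_price_gwei=120.0),
]
ranking = [r.oracle for r in rank_responses(responses)]
print(ranking)  # ['oracle_c', 'oracle_b', 'oracle_a']
```

Here oracle_b and oracle_c land in the same block, so the tie is broken in favour of oracle_c, which paid the higher gas price.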
Using gas price as a determining factor may suggest that response time reputation can be ‘bought’. However, the most competitive, highest reputation oracle is most likely to be one that has the resources required to manage their oracle successfully. The Chainlink network demands fast response times for effective volume scaling, so if increasing the gas price decreases response time then this is a desired outcome.
Consider a future with computationally difficult and time-consuming job runs, where response times may trend upwards of 15 blocks. It would be unfair for block response time to negatively affect reputation in such cases. To address this, percentile of response could be implemented in the future to determine reputation for certain job types. In the instance where an oracle submits an answer 20 blocks after a request but is the first out of many oracles to respond, placing them in the top percentile of respondents, they would achieve the highest reputation score attainable for that transaction. For such a process, however, a minimum number of required respondents would have to be taken into consideration.
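A percentile-of-response measure along these lines could look like the sketch below. The function name, the `min_respondents` default of 3 and the choice to count ties in the oracle’s favour are all assumptions made for illustration, not part of any published model.

```python
def response_percentile(blocks_taken, oracle, min_respondents=3):
    """Percentage of respondents that `oracle` was at least as fast as.

    blocks_taken maps oracle name -> blocks between request and response.
    Returns None when fewer than `min_respondents` oracles answered,
    since a percentile over too few respondents says very little.
    """
    if oracle not in blocks_taken or len(blocks_taken) < min_respondents:
        return None
    own = blocks_taken[oracle]
    not_faster = sum(1 for blocks in blocks_taken.values() if blocks >= own)
    return 100.0 * not_faster / len(blocks_taken)


# A slow job type: every answer takes 20 or more blocks.
slow_job = {"oracle_a": 20, "oracle_b": 25, "oracle_c": 31, "oracle_d": 40}
first = response_percentile(slow_job, "oracle_a")
last = response_percentile(slow_job, "oracle_d")
too_few = response_percentile({"oracle_a": 20}, "oracle_a")
print(first, last, too_few)  # 100.0 25.0 None
```

Although oracle_a needed 20 blocks, it still receives the top percentile because it was first among its peers.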
Mempool transactions can be viewed, and an oracle’s answer can be deduced from them unless privacy measures are taken that obfuscate the data contained in the transaction. It would be unfair to judge response time based on when an oracle’s response is seen in a node’s mempool, because different nodes exhibit different transaction ordering. In addition, some transactions may appear in the mempool but fail to execute due to insufficient gas or for other reasons.
Responses where answers are rejected by a job requester can still contribute to an oracle’s response time reputation. One reason for this is that there may be extremely selective and unique aggregation methods in use - for example, a requester may be looking for the median oracle’s answer, taking only 1 answer out of 100 oracles. In addition, intentionally responding without submitting the correct data as a means to reduce response time is likely to become economically infeasible as penalty payments and staking are introduced.
Jobs Completed
- Jobs Completed: 100
- Completion Ratio: 100%
When measuring the job completion metric and considering how it affects the reputation of an oracle, it is assumed that an oracle’s reputation should be determined by total jobs completed and the job completion ratio. An oracle that has accepted many jobs or service agreements, but has barely completed any of them, should not be able to maintain a high reputation. The successful completion of a job is the critical factor that job requesters are looking for. Therefore, a job completion ratio may be considered one of the more objective measures that can form a reputation score.
To decipher the value of each job completion, data such as LINK earned per job would be a worthwhile metric for job requesters to analyse. An operator that has completed 1000 jobs worth 1 LINK each has earned only as much LINK as an operator that has completed 1 job worth 1000 LINK. However, LINK earned and LINK earned per request are extremely easy metrics to game artificially, so caution is recommended when using them to produce a reputation score.
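The two operators from the example above can be compared with a small sketch. The `JobStats` class and its field names are hypothetical and exist only to show how completion ratio and LINK per job separate two oracles with identical total earnings.

```python
from dataclasses import dataclass


@dataclass
class JobStats:
    accepted: int       # jobs or service agreements the oracle accepted
    completed: int      # jobs it actually fulfilled
    link_earned: float  # total LINK earned across completed jobs

    def completion_ratio(self):
        """Completed jobs as a fraction of accepted jobs, or None with no history."""
        return self.completed / self.accepted if self.accepted else None

    def link_per_job(self):
        """Average LINK earned per completed job, or None with no completions."""
        return self.link_earned / self.completed if self.completed else None


many_small = JobStats(accepted=1000, completed=1000, link_earned=1000.0)
one_large = JobStats(accepted=1, completed=1, link_earned=1000.0)

print(many_small.completion_ratio(), many_small.link_per_job())  # 1.0 1.0
print(one_large.completion_ratio(), one_large.link_per_job())    # 1.0 1000.0
```

Both operators show a 100% completion ratio and the same total earnings, yet they have very different track records, which is one reason LINK-based figures are better treated as context than as a direct input to a score.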
Service agreements that use a contract triggered by an external initiator need further consideration as to how reputation is affected. For example, consider a service agreement that employs an oracle for the duration of an insurance contract that monitors the temperature of an IoT device. If the device’s temperature deviates beyond a set threshold then the contract is executed. If the device does not record such a deviation then there will be no recorded job-run from that oracle during the service agreement. If no job-run is recorded, the oracle will not receive any reputation for the task it may have just successfully completed.
Staking Performance
- Total LINK staked: 100,000 LINK
- Total LINK lost via slashing: 0 LINK
The LINK earned via service agreements and lost through slashed stakes is a very objective way to measure the performance of an oracle. One assumption that can be made when analysing this data is that the most reputable oracles in the network are those that have staked the highest amount of value against the supply of their data. Conversely, the least performant oracles are those that have lost a majority of their stake over time.
Attacking Reputation
A reputation score is only valuable if it is sufficiently difficult to acquire a high score and if every stakeholder in the system can agree that it is a valuable measure of an oracle’s abilities. It is extremely important to contemplate the cases where reputation can be gamed, cheated or manipulated. Two attacks that were considered, outside of the realm of Sybil or mirroring attacks, are:
- The Retirement Attack
- The Long Con Attack
The Retirement Attack
Reputation death, or a Retirement Attack, is when an oracle forfeits its reputation by taking a bribe, or simply because its operator no longer wishes to participate in network activity. For the ecosystem to survive and build up an immunity against bribery attacks, an oracle must value the long-term benefits of a reputation score more than the short-term gain of a bribe. The Chainlink network is an extremely niche market, and an exit scam carried out by a pseudonymous oracle would not necessarily result in any punishment for the node operator’s real-world entity.
Currently Chainlink’s feeds are all ‘light feeds’ (as in the MakerDAO context) - oracles that have claimed their identities as companies or organisations. In the future it is likely that there could be oracles which have no identity outside of their on-chain Ethereum or blockchain address, in which case there is nothing to preserve externally in the physical world.
The Long Con Attack
Due to the decentralised, open source nature of Ethereum and Chainlink, oracles have the opportunity to build their own reputation using their own contracts. Building one’s own reputation in such an environment may be considered economically infeasible, as there is no net gain in LINK, only a net loss of ETH spent on gas and of the resources used to keep the oracle running. However, the gas cost of gaining reputation may be recovered, and even profited from, through what we describe as the Long Con Attack.
In a Long Con Attack, an ‘artificially’ reputable oracle is used to attack high value contracts that are looking for high reputation oracles. Once a job requester has chosen the oracle that built its own reputation for the sole purpose of sabotage, the attacking oracle can report the wrong answer, lose its stake and, if it is able to gain from the contract through other means, still profit from the entire play.
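The economics of the attack can be made concrete with a deliberately simplified sketch. It assumes that the attacker’s outside gain, the stake lost and the cost of building artificial reputation can all be expressed in one common unit; the function names and figures are hypothetical.

```python
def long_con_profit(external_gain, stake_lost, reputation_cost):
    """Attacker's net result, with all amounts expressed in the same unit."""
    return external_gain - stake_lost - reputation_cost


def stake_needed_to_deter(external_gain, reputation_cost):
    """Smallest stake at which the attack stops being profitable (profit <= 0)."""
    return max(0, external_gain - reputation_cost)


# Hypothetical figures: 500 gained elsewhere, 300 staked, 50 spent on gas.
profit = long_con_profit(external_gain=500, stake_lost=300, reputation_cost=50)
deterrent = stake_needed_to_deter(external_gain=500, reputation_cost=50)
print(profit, deterrent)  # 150 450
```

Under this simplification the attack pays whenever the outside gain exceeds the stake plus the cost of the reputation build-up, which is why penalty deposits sized to the value at risk matter.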
With penalty deposits in place and by using a large number of oracles, job requesters should be able to defend against such an attack easily - but it remains a foreseeable move for a sophisticated attacker. It is interesting to consider that the ‘artificial’ reputation of an oracle may be comparable to the non-artificial reputation of other oracles, as the on-chain analysis is entirely objective when it comes to recording response times, job completion rates and gas prices.
Answer Accuracy
Answer accuracy can be derived from on-chain activity. The accuracy metric can take into account whether or not an oracle’s answer is accepted by the aggregation contract. When a group of oracles respond to a request, their answers may then go through an aggregation phase. The post-aggregation result may lead to an oracle’s answer being accepted or declined (i.e. they have provided a right or wrong answer). In binary cases this is a simple calculation. In one of many possible cases relating to price feeds, job requesters may decide that answers more than two standard deviations from the aggregated value are considered to be wrong. It is worth noting that there is no limit to the possible methods of aggregating an answer.
A potential solution to creating consistency and deriving accuracy post aggregation is to construct standardised aggregation formats. For example there could be a ‘2DEV’ format that only accepts answers within 2 standard deviations of the mean value. This ‘2DEV’ format could be flagged to be used for the duration of a service agreement - only when this standard is flagged would the response accuracy be recorded by the system.
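A ‘2DEV’-style check could be sketched as follows. The function name is hypothetical, and the use of the population standard deviation over all submitted answers is an assumption; a real format would have to specify which deviation is used and how very small respondent sets are handled.

```python
import statistics


def two_dev_filter(answers, k=2.0):
    """Map each oracle to True if its answer lies within k standard
    deviations of the mean of all submitted answers, else False."""
    values = list(answers.values())
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population deviation (an assumption)
    return {oracle: abs(value - mean) <= k * stdev
            for oracle, value in answers.items()}


# Nine oracles agree on a price near 100; one reports 150.
answers = {f"oracle_{i}": v for i, v in
           enumerate([99, 100, 101, 100, 99, 101, 100, 100, 100, 150])}
accepted = two_dev_filter(answers)
rejected = [oracle for oracle, ok in accepted.items() if not ok]
print(rejected)  # ['oracle_9']
```

Note that with very few respondents a single outlier can never exceed two population standard deviations, which is another reason a minimum respondent count would need to be part of any such standard.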
Off-chain Analysis for Reputation
By assessing all on-chain and off-chain data relating to oracles, service agreements and contracts, insights can be formulated and presented to the job requester in an easily digestible form. Presenting data that supplements the reputation score and basic metrics gives increased insight into an oracle’s past record.
Additional insight could be gained by monitoring the number of unique requesters that an oracle serves. From this data, one can draw conclusions about the variability and flexibility of an oracle when it comes to engaging in different service agreements for different contracts. A job requester looking for a highly experienced oracle may deem it unfavourable if an oracle has only served one prior contract, even if it has an extremely high reputation.
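Counting unique requesters is straightforward once job runs are available as (oracle, requester) pairs. The sketch below assumes that such pairs have already been extracted from on-chain data; the names are hypothetical.

```python
from collections import defaultdict


def unique_requesters(job_runs):
    """Count the distinct requesters each oracle has served.

    job_runs is an iterable of (oracle, requester) pairs, one per job run.
    """
    seen = defaultdict(set)
    for oracle, requester in job_runs:
        seen[oracle].add(requester)
    return {oracle: len(requesters) for oracle, requesters in seen.items()}


job_runs = [
    ("oracle_a", "contract_1"), ("oracle_a", "contract_1"),
    ("oracle_a", "contract_1"), ("oracle_b", "contract_1"),
    ("oracle_b", "contract_2"), ("oracle_b", "contract_3"),
]
counts = unique_requesters(job_runs)
print(counts)  # {'oracle_a': 1, 'oracle_b': 3}
```

Both oracles have three job runs, but oracle_a has only ever served a single requester.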
Monitoring data such as the different service agreements that oracles have engaged in, as well as the data that is related to that service agreement would allow for highly job-specific analysis. As well as this, insights relating to revenue or losses per job could be gained through analysing LINK earned or lost through penalty deposits. Prior analysis can be conducted on the Ethereum testnet to discern whether or not an oracle has tested out their capabilities. The intention here would be to provide the job requester with as much data as possible to help them make the most informed decision on oracle selection that they can.
Sybil attacks may be discovered preemptively by identifying oracles that respond in identical ways. There is a risk that oracles may be falsely flagged as Sybil attackers due to many being run on similar cloud servers. But oracles born at the same time, serving the same contracts, responding at the same time and using the same gas price could be flagged as suspicious. Another tracing element would be to analyse which Ethereum addresses have deposited ETH into a node to fund it - a funding address shared across several nodes would indicate a high Sybil attack risk.
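The funding-address heuristic can be sketched as a simple grouping. The input format and the addresses are hypothetical placeholders, and a shared funder is only a signal worth investigating, not proof of a Sybil attack.

```python
from collections import defaultdict


def shared_funders(funding_events):
    """Return funders that have sent ETH to more than one oracle node.

    funding_events is an iterable of (funder_address, oracle_address) pairs.
    """
    by_funder = defaultdict(set)
    for funder, oracle in funding_events:
        by_funder[funder].add(oracle)
    return {funder: sorted(oracles)
            for funder, oracles in by_funder.items() if len(oracles) > 1}


funding_events = [
    ("0xfunder1", "0xoracleA"),
    ("0xfunder1", "0xoracleB"),
    ("0xfunder2", "0xoracleC"),
    ("0xfunder1", "0xoracleA"),  # repeat top-up, counted once
]
suspicious = shared_funders(funding_events)
print(suspicious)  # {'0xfunder1': ['0xoracleA', '0xoracleB']}
```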
Reputation and Off-Chain Aggregation
Off-chain aggregation (OCA) makes it slightly more difficult to determine metrics other than on-chain staked amounts. With OCA, the aggregated answer may be the only one written on-chain, so individual oracles’ response times and whether they submitted answers at all are not explicitly represented on-chain.
As outlined in the Chainlink whitepaper, in the off-chain portion of the aggregation an oracle generates a digital signature when it submits an answer. This digital signature can then be sent to a validation contract that rewards oracles for submitting evidence of erroneous behaviour. To establish whether or not an oracle has responded, it is proposed that this validation contract would accept attestations from oracles, each being a digitally signed set of the responses that the oracle received from other oracles. Here again oracles would be rewarded for their submissions. By referencing this validation contract it is possible to acquire data relating to an oracle’s answer if it has responded to a request.
To address response time, if an oracle’s answer is digitally signed and timestamped, there could be a way to record the time an oracle submits their answer and then compare it to when a request was posted to generate a response time.
Third Party Risk
Going forward, any reputation provider that relies on centralised components to curate and generate a reputation score can be considered a third party security risk, and therefore a security hole. One way to hedge against third party failure is to generate and write the reputation score on-chain. This would mean that the most recently written on-chain score can be referenced by other analytical front-ends, as well as by service agreements. There are certainly other measures that can be taken to ensure the validity and provability of data, such as IPFS deployments and Truebit-style computations.
Specialisation of Oracles and Reputation
As the Chainlink marketplace for data is further realised, one possible scenario is that certain oracles will head down the path of specialisation, whether performance specialisation (e.g. high processing, secure computations) or industry specialisation (e.g. becoming an oracle that serves all data points that can be sourced in a weather market). Because of specialisation, it is difficult to form a single reputation score that accurately conveys the reputability of an oracle within a specific context. It can therefore be expected that multiple reputation providers will service multiple markets for different oracle use-cases.
For example, demand is foreseeable for a reputation scorer that specialises in analysing the data and framework of IoT devices. Such a scorer may have created methods for observing and isolating anomalies found within these devices. This anomaly detection will be a key part of recognising fraudulent behaviour and responses. In addition, the hardware itself can be scored on whether or not the device has tamper-proof measures, security features, etc. With specialisation, it would be anticipated that markets would emerge for auditing and for other services that provide device insurance or onsite security.
Consider a ‘wisdom of the crowd’ network of human oracles that vote upon the outcome of events with a device of their own. In an example of a horse race that isn’t recorded and doesn’t have race result data, a betting contract creator may demand 1000 human oracles to input the outcome of the race. Human Oracles may then engage in a service agreement where they are rewarded LINK for answer accuracy post-aggregation, or they may report and vote on the outcome of a race and be rewarded via a payout mechanism similar to that of the Schelling coin. From this human oracle use-case there could be a reputation provider that specialises in analysing the validity and trustworthiness of a voting device/person.
Fraud Detection
Fraud detection is a service that analyses and proves whether oracles are pulling their data from an original data provider. It acts as a measure to ensure that oracles provide original answers, rather than engaging in freeloading or mirroring attacks, which undermine the value of a service agreement. At a high level, such a service can be achieved through data providers digitally signing their response whenever an oracle calls their API. The timestamped digital signature is given to the oracle, which then relays it to a third party fraud detective. The fraud detective then matches the signature that the oracle has provided with the signature that the data provider originally generated.
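The matching step can be sketched with Python’s standard library. One loud assumption: an HMAC with a provider-held secret stands in here for a real digital signature, purely to keep the example self-contained, whereas a production system would use an asymmetric signature scheme. The key, function names and log format are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held by the data provider (stand-in for a signing key).
PROVIDER_KEY = b"hypothetical-provider-secret"


def provider_sign(oracle_id, payload, timestamp):
    """Provider side: sign the response issued to a specific oracle."""
    message = f"{oracle_id}|{timestamp}|{payload}".encode()
    return hmac.new(PROVIDER_KEY, message, hashlib.sha256).hexdigest()


def detective_check(oracle_id, relayed_signature, provider_log):
    """Detective side: does the relayed signature match one that the
    provider recorded as issued to this particular oracle?"""
    issued = provider_log.get(oracle_id, ())
    return any(hmac.compare_digest(relayed_signature, s) for s in issued)


# The provider answers oracle_a's API call and logs the signature it issued.
signature = provider_sign("oracle_a", '{"price": 100}', 1628208000)
provider_log = {"oracle_a": {signature}}

honest = detective_check("oracle_a", signature, provider_log)
mirrored = detective_check("oracle_b", signature, provider_log)
print(honest, mirrored)  # True False
```

Because the signature is bound to the oracle that made the API call, an oracle that merely copies another oracle’s answer and signature does not pass the check.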
Initial applications for fraud detection services would be best suited to the rapidly growing DeFi space ($7 bn USD TVL), where smart contracts are highly dependent on the validity of the data that they receive from oracles. It is important for these DeFi applications to have real-time updates as to whether or not oracles in a service agreement are cheating on their answers, so that they can remove an oracle from that service agreement or update it for increased security.
When implementing such a system, it is critical that oracles which are reported to have a false positive reading from the fraud detection system are not harshly penalised. It is imperative that the system design is transparent and as objective as possible. Oracles that are detected by a fraud detection service provider can be quickly blacklisted by other oracles and contract creators to prevent future attacks and siphon out parasitic oracles.
A simple fraud detection system like the one outlined above would be effective in demonstrating that an oracle is subscribed to a data provider. However, it does not demonstrate whether the data an oracle reports is the same data that was provided to it. For that, the answer must be cross-referenced with API results, either through a verifiable computation or a trusted third party.
Data Provenance and Reputation
The APIs and price feeds used for certain DeFi contracts would contribute largely to the risk level of a contract when conducting some sort of risk analysis. It is therefore rational that the parties that engage in a smart contract will tend to select oracles and data providers that provide high levels of transparency, proof and provenance over the data source they use. It is anticipated that because of this demand driven selection, data providers and data sourcing devices will become more open-source and transparent about the methods in which they collect data and the fail-safes that are in place, in the event of attacks.
This necessity to minimise risk, even down to the device recording the data, opens up markets for reputation providers all along the value-chain of a smart contract. Reputation may be applied to Ethereum service providers, IoT device manufacturers and API providers. For example, in the case of an API provider, reputation could be applied to them by analysing metrics such as uptime, throughput and accuracy (relative to other industry-similar data providers).
Further proofs of data provenance can be achieved by using protocols such as DECO. DECO allows its users to prove which website they obtained their data from, and that data could be stored in privately hosted databases. Risk can be minimised by using DECO’s three-party handshake over TLS, as it provides a proof that the data came from a certain source - however, it does not prove whether data coming from the source has been manipulated or is ‘correct’.
Relative Performance Ranking
When deciding upon how to construct the model that will quantify reputation for the Chainlink network, our team has established several goals:
- Decentralise metric 'weightings'
  - Reputation.link should not and will not solely hold the responsibility of deciding which metrics are valuable to the market or not.
- Encourage competition and a free market
  - There should be elasticity in rankings, with oracles constantly trying to out-manoeuvre each other for market share.
- Encourage Specialisation
  - Following from the second point, in the free market, various niches will develop. Our model will encourage and reward oracles who fulfil these niches, which will lead to oracles specialising.
- Scalable for n metrics
  - We want the reputation calculation to evolve with the Chainlink ecosystem. The valuable and niche metrics of today may not be the same ones that matter 10 years from now. Our model must adapt dynamically as these changes happen in the ecosystem.
We believe that the model we've developed hits these goals, and we're looking forward to unveiling it in the next couple of months.
Ending Note
The team at reputation.link is very interested in discussing the topic of reputation and welcomes an open dialogue in our Discord and through the Chainlink forum. Having the Chainlink ecosystem thinking about the problem and its applications will help ensure that it is implemented correctly and appropriately, to the great benefit of the system at large.
Reputation.link is a secure and scalable data streamlining service and front-end visualisation tool that allows users to contextualise, visualise and analyse the Chainlink network. The platform illuminates the historical and real-time performance of individual oracles, allowing developers to evaluate oracle reliability quickly and efficiently and, in turn, understand the security guarantees of the data feeds that secure decentralised financial protocols. This allows smart contract developers and users to do their due diligence on Chainlink as a solution.