40 – Discussion about Komodo’s Oracles

The following articles explain how Komodo trustless Oracles work:

https://komodoplatform.com/tt2019-14-oracles-prediction-market-feeds/

https://komodoplatform.com/the-promise-of-smart-contracts-and-the-oracle-problem/

jl777 explains the Oracle in his own words:
https://medium.com/@jameslee777/decentralized-trustless-oracles-dto-by-piggybacking-on-timestamp-consensus-rules-2adce34d67b6

Zack Hess (Amoveo developer and Oracles expert) gives his opinion on jl777’s method:
https://github.com/zack-bitcoin/amoveo/blob/master/docs/other_blockchains/fast_oracles.md

They stirred some controversy but also the following interesting debate.

* * *

Timo Harings:
Hey there! I would like to talk about Komodo’s Oracle. Is there someone from the dev team open for a discussion?

jl777:
Ok @Timo Harings what questions do you have?

Timo Harings:
The article about Komodo’s Oracles vs. other Oracle projects says there is a flaw in how they are built. I find it hard to compare the Komodo Oracle to something like Chainlink. Do they both serve the same use cases? Based on the article from Komodo, it seems like it’s for price feeds only.

jl777:
Chainlink uses data aggregation, which is the level that our CC-Oracles is for. To get a trustless Oracle, you need to do something like what we do, and it would apply to anything that has an online data source. As far as use cases go, there is room for trusted, aggregated and trustless Oracles, and some things won’t be so easy to make a trustless Oracle for, due to the lack of a freely available online data source.

Timo Harings:
Yeah, I would get into the trustless/decentralization part when comparing it to Chainlink. Aggregation is only one component for securing the network. The other two big parts are stake and reputation. Why do you think Chainlink is not secure enough/not trustless (enough)?

jl777:
Stake and reputation add detection and disincentives but don’t prevent an attacker. As I said, Chainlink and CC-Oracles are sufficient in many cases, but if the bet size dwarfs the value of the stake, then it is not a deterrent, is it? If the attacker is patient and builds up reputation over months or years, it doesn’t prevent that, does it? Things are on a spectrum, there are no absolutes.

Timo Harings:
How does it not prevent an attacker? Reputation raises the weight of the people who act honestly and want to earn money. In the long term, raising your reputation brings you more ‘jobs’ and thus more income through the node. An attacker would have to ‘farm’ reputation for a long while to even consider attacking and maybe losing his stake.

jl777:
What if he is betting 10x or 100x the value of the stake/reputation? This model caps the entire at-risk amount to the value of the stake, but there is no way to measure off-chain bets. So the stake/reputation systems work quite well, until they don’t.

Timo Harings:
“Bet size higher than stake”, how would that happen? If I (as a Dapp developer) could set the stake the nodes must have for providing my Dapp with their data, I can always have it higher than the value behind it.

jl777:
“Off-chain betting”. Let’s say I make a bet with you about next year’s World Cup. We both happen to be billionaires, so we make a billion-dollar sidebet.

Timo Harings:
It’s about on-chain for now. Off-chain is not a thing yet. Chainlink didn’t build into that direction yet. Main use case is on-chain.

jl777:
But you can’t stop off-chain, since it is off-chain, and off-chain bets can and will influence on-chain. You certainly can’t guarantee that nobody would ever make sidebets off-chain.

Timo Harings:
So when it’s about on-chain, you would say reputation and staking are more secure/secure enough?

jl777:
As I said, it will be sufficient, until it isn’t. You can’t stop off-chain sidebets. You can’t measure it, yet it creates an on-chain incentive.

Timo Harings:
If we bet a million dollars on the outcome of a football match, I would want to have set up 10 Chainlink Oracles which all fetch data from 10 different match-result APIs.

jl777:
Who cares about a 10 million dollar stake, if you will win a 1 billion dollar bet?

Timo Harings:
How would you game that system with every node having at least 100,000 at stake?

jl777:
I have 100 million dollars to spend, you really think I couldn’t corrupt 6 of them?

Timo Harings:
Okay, you got a big bunch of money, go on…

jl777:
Either directly bribe the 10 oracles or participate in them with my army of high reputation/high stake accounts.

Timo Harings:
How do you contact them?

TonyL:
You can set up 10 Chainlinks from different source APIs and use their average price/odds as an index.

jl777:
So that is the defense? The people running the oracles are hard to find? Security by obscurity is no security at all.

Timo Harings:
No, that’s just the first one.

TonyL:
And if one obviously lies, just don’t count it, using some confidence interval.
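
As an illustration of the aggregation TonyL describes, here is a minimal Python sketch. It is not Chainlink’s actual algorithm: the function name `aggregate`, the 5% cut-off and the use of the median as the reference point are all assumptions made for this example, standing in for “some confidence interval”.

```python
from statistics import median


def aggregate(reports: list[float], max_deviation: float = 0.05) -> float:
    """Average the reports lying within max_deviation of the median report."""
    mid = median(reports)
    kept = [r for r in reports if abs(r - mid) <= max_deviation * abs(mid)]
    if not kept:
        raise ValueError("no report is close enough to the median")
    return sum(kept) / len(kept)


# Ten hypothetical sources reporting a price whose true value is 10.0.
one_liar = [10.0] * 9 + [50.0]
six_liars = [10.0] * 4 + [50.0] * 6

print(aggregate(one_liar))   # 10.0, the single outlier is dropped
print(aggregate(six_liars))  # 50.0, the honest minority is dropped instead
```

The filter handles a lone liar well, but once a majority of the sources report the same false value, the honest reports become the outliers.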

jl777:
It would likely be more practical to just participate in the Oracles directly, I have $100 million worth of stake/reputation to burn and still make $900 million. $100 million can create thousands of sybil accounts and infiltrate every Oracle. So using trusted Oracles would seem the only way, but now you are using trusted Oracles which, at scale, can be found and bribed. The 51% attack was just theoretical, until it wasn’t.
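
The arithmetic behind jl777’s argument can be written down in a few lines of Python. This is only a sketch of the incentive comparison; the function names are hypothetical and the dollar amounts are the ones used in the conversation.

```python
def attack_profit(side_bet_winnings: int, stake_forfeited: int) -> int:
    """Net gain for an attacker who wins an off-chain bet but loses all stake."""
    return side_bet_winnings - stake_forfeited


def stake_deters(side_bet_winnings: int, stake_forfeited: int) -> bool:
    """Stake only deters if corrupting the oracle would not make money."""
    return attack_profit(side_bet_winnings, stake_forfeited) <= 0


# A $1 billion side bet against $100 million of stake/reputation to burn.
print(attack_profit(1_000_000_000, 100_000_000))  # 900000000
print(stake_deters(1_000_000_000, 100_000_000))   # False
# If the value at risk really is below the stake, the deterrent holds.
print(stake_deters(1_000_000, 10_000_000))        # True
```

The stake can be sized against the on-chain value, but the side bet is invisible to the protocol, so the first argument is never known to whoever sets the stake.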

Timo Harings:
Don’t dodge it, let’s go step by step…

jl777:
Dodge what? Are you saying that if I have majority of stake, I can’t corrupt every Oracle? If I have an army of sybil accounts, I can’t have them all behave perfectly nicely until I want to corrupt some Oracle data? I was on the other side of this debate with Zack but he convinced me that data aggregation won’t work. Maybe you can make it less likely to be corrupted, but ultimately the off-chain sidebet totally blows it out.

Zack Hess (@zack_bitcoin)
This includes a great description of parasite contracts, and how they can destroy @AugurProject $rep and @Truthcoin
Amoveo’s $veo Oracle does not have any trading fees, so parasite contracts would be beneficial. They would not harm amoveo. https://t.co/osgPkZaeRK
https://twitter.com/zack_bitcoin/status/1117073767920996353

The Parasite and the Whale
https://medium.com/@edmundedgar/the-parasite-and-the-whale-7cb3c87e9902

Timo Harings:
I am betting big money here, so I am going to make sure the data comes from a big Chainlink node operator. There will be further metrics I can choose Oracles by. I’m not sure how familiar you are with Chainlink… You CAN have a metric saying “I want to use one node operator that has worked 50,000 jobs and had no penalty/bad data for x amount of time”. I can also go and choose a node operator that is “certified by a third party”, let’s say a known data provider from the non-crypto space. This is gonna be a thing.
Do you think you can pay him enough that he throws away all his reputation, losing the revenue he worked for over months/years? Let’s say you find out which 10 nodes are serving the smart contract and contact them. You would have to convince 50% of them to try and attack the network by sending false data. So they would either change the data after receiving it from the API, inside the Oracle node, and then send it (which isn’t possible with TEE), or you would have to convince 50% of the actual data providers, each the size of Bloomberg, since I, the one who opens the bet with you, choose 10 big fucking providers to deliver that data. And the 50% corrupt providers still risk losing all their stake.

jl777:
Ok, ignore the game theory if you want. I am just relaying the points made by experts who are deep into this for years.

Timo Harings:
“If I have an army of sybil accounts, can’t I have them all behave perfectly nicely until I want to corrupt some Oracle data?”
An army of sybil accounts doesn’t mean anything, they would be outweighed by a lot. Say, for example, someone has $100,000 of Chainlink tokens. That is just one of ten. So you pay at least 1 million to acquire tokens just to have more stake. Then you have the reputation. If you just newly created these accounts, the reputation is pretty much zero. So even if these 5 nodes hold $500,000 of LINK tokens in total, they will get outweighed once the reputation gets calculated in. I will answer any of your points, no worries.

jl777:
With sufficient funds, the adversary can become the “one node operator that has worked 50,000 jobs and had no penalty/bad data”. The adversary is patient, he works these high stake accounts for years and becomes a trusted part of the ecosystem.

Timo Harings:
No, I am smart and choose at least 9 others with quite similar statistics, not 9 others that have only 50% of that in total.

jl777:
And I have essentially limitless funds so I have 20 such, all with $100k+ each and great performance. The best performance. Everybody knows this and then one day… blam! The sleeper awakens and corrupts the big money bet.

Timo Harings:
Yes, then it could be possible. If you have worked on these for years, invested in 1,000,000 LINK tokens and got hundreds of thousands of jobs done, building a really high reputation… But then these guys make more in the long run by just continuing to serve true data than from the one time you pay them 1 million dollars.

jl777:
These are agents of the evil billionaire, he doesn’t need to pay them. So there is indeed deterrence and detection, but not total protection from false data. That is the point. Maybe this is just fine and a practical trade-off. This all started when they said my CC-Oracles were no good. I tried to defend them but, in the end, aggregated data is aggregated data and there is no way to really protect against a determined, patient and well-funded attacker.

Timo Harings:
“I was on the other side of this debate with Zack but he convinced me that data aggregation won’t work”. Data aggregation alone, no… I agree. But with these other components it does. It makes it just too hard to break. So hard that the profit is a lot bigger when staying on the true side, delivering what was asked.

jl777:
That is true until you bring in the off-chain bet. All assumptions that a $100k disincentive is adequate go out the window with a $100 million sidebet.

Timo Harings:
Off-chain is another thing that no Oracle provider has finalized yet. Also not Chainlink. So debating about that doesn’t make much sense to me.

jl777:
Well I did, trustless Oracles don’t allow bad data on-chain.

Timo Harings:
So how do you get the data? How is it fetched?

jl777:
No complex incentive/disincentive system.

Timo Harings:
So the nodes act by the rules without incentive?

jl777:
It is fetched via the internet. The incentive is to mine blocks.

Timo Harings:
Can I spin up a node? Or is it run by Komodo (Foundation)?

jl777:
You need to post valid data to mine a block.

Timo Harings:
How is it checked that the data is valid?

jl777:
There is no ‘Komodo foundation’.

Timo Harings:
Who runs the nodes then? Can I spin one up right now?

jl777:

./komodod -ac_cbopret=7 -ac_prices="LTC, BCHABC, XMR, IOTA, ZEC, WAVES, LSK, DCR, RVN, DASH, XEM, BTS, ICX, HOT, STEEM, ENJ, STRAT" -pubkey=<yourpubkey> -ac_name=REKT0 -ac_cclib=prices -ac_cc=10777 -ac_reward=3000000000 -ac_supply=120000000 -ac_pubkey=039433dc3749aece1bd568f374a45da3b0bc6856990d7da3cd175399577940a775 -ac_perc=77777 -ac_blocktime=600 -addnode=5.9.102.210 &

Yes, run that. It is still in testing phase but so far it is working well.
http://159.69.45.70:8050/

Timo Harings:
I can install the software, get connected to mainnet and run actual jobs for other people?

jl777:
It is a different model than Chainlink, it is a dedicated chain for a specific set of prices.

Timo Harings:
Yeah testnet is fine. Okay, how does my node get selected to serve someone with data?

jl777:
You are not understanding: ALL nodes get all data and validate all blocks, all blocks must have valid data to be accepted by the network, like bitcoin validates every tx from genesis.

Timo Harings:
What happens if 9 nodes say X and my nodes say Y and I try to finalize the block just like the others?

jl777:
There is a +/- 1% tolerance relative to local prices, if a block that comes in exceeds that, it is rejected. Very similar to how timestamps are handled. In fact, almost identical: if a timestamp from the future comes in, it is rejected (if > 60 seconds in Komodo).
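
The rule jl777 describes can be sketched in Python as follows. This is an illustration, not the Komodo source code: the function names, the block layout (a dict with `time` and `prices`) and the sample prices are assumptions; only the 1% tolerance and the 60-second limit come from the conversation.

```python
PRICE_TOLERANCE = 0.01   # +/- 1% relative to the validating node's own price
MAX_FUTURE_SECONDS = 60  # timestamps further ahead than this are rejected


def price_ok(block_price: float, local_price: float) -> bool:
    """True if the block's price is within tolerance of the local price."""
    return abs(block_price - local_price) <= PRICE_TOLERANCE * local_price


def timestamp_ok(block_time: int, local_time: int) -> bool:
    """True unless the block's timestamp is too far in the future."""
    return block_time - local_time <= MAX_FUTURE_SECONDS


def block_ok(block: dict, local_prices: dict, local_time: int) -> bool:
    """A block is valid only if its timestamp and every price pass."""
    if not timestamp_ok(block["time"], local_time):
        return False
    return all(
        symbol in local_prices and price_ok(price, local_prices[symbol])
        for symbol, price in block["prices"].items()
    )


local_prices = {"LTC": 100.0, "XMR": 80.0}
now = 1_560_000_000

good = {"time": now + 30, "prices": {"LTC": 100.5, "XMR": 79.6}}
bad_price = {"time": now + 30, "prices": {"LTC": 103.0, "XMR": 80.0}}
from_future = {"time": now + 3600, "prices": {"LTC": 100.0, "XMR": 80.0}}

print(block_ok(good, local_prices, now))         # True
print(block_ok(bad_price, local_prices, now))    # False
print(block_ok(from_future, local_prices, now))  # False
```

Each node runs this check on its own against its own data, so no vote or aggregation step is involved.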

Timo Harings:
Okay, so I can falsify the value by 0.99%? 9 nodes say the price is $10.00 and my node says it’s $10.099. What happens at block finalization?

jl777:
If a price >1% away from local price comes in, it is rejected. Only one miner mines the block, right? It doesn’t matter what other nodes say, it is not a vote. It is a consensus rule. It is not aggregating.

Timo Harings:
Why doesn’t it matter? If I am the fastest to mine that block and put in my wrong data, who or what is stopping me from that?

jl777:
Nobody stops you, now you go onto your own fork. When the first honest miner mines a valid block the network will accept that one.

Timo Harings:
Yeah but my block is valid. I managed to finalize the block the fastest and can put data into it. How could someone else check that the data is false? If the other 9 nodes say something different, there is an instant fork? So it’s democratic? Like with hashpower, but just with plain data I put into the block?

jl777:
It isn’t valid if it fails the “are all the prices valid?” test. Try mining with your system clock set wrong, what happens?

Timo Harings:
Why would it fail that if I only have it false by 0.99%?

jl777:
Ok, at 0.99% you are ok. There is a 2% attack bias by +1% then by -1%, but this is mitigated by 51% correlation. It is still possible if you sustain a 51% attack for days, and then you get to bias prices by 1%.

Timo Harings:
Okay, I would agree that this would be secure enough for a lot of use cases. But in no way more secure than Chainlink with aggregation+reputation+stake+TEE. It’s really clever binding the volatility to the blocks and limiting it, it is elegant. The first question I had after reading the article was: what about other types of data that can’t be held to 1% fluctuations? Like weather data that changes a lot more, or completely non-numeric data?

Dukeleto:
Use logarithms

jl777:
Prices do change by more than 1%. In such a case there is an exception that allows a 1% change per block, so in 10 blocks the price has changed by 10%. So as long as things aren’t changing by 50% all the time, it will work, with a delay.
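
To get a feel for the delay jl777 mentions, here is a small Python sketch of a per-block cap. It assumes the 1% limit compounds from one block to the next; the conversation does not spell out the exact mechanism, and the function names are hypothetical.

```python
def clamp_step(previous: float, target: float, max_change: float = 0.01) -> float:
    """Move from previous toward target by at most max_change in one block."""
    lower = previous * (1 - max_change)
    upper = previous * (1 + max_change)
    return min(max(target, lower), upper)


def blocks_to_reach(start: float, target: float, max_change: float = 0.01) -> int:
    """Count the blocks needed for the on-chain price to reach target."""
    if start <= 0 or target <= 0:
        raise ValueError("prices must be positive")
    price, blocks = start, 0
    while price != target:
        price = clamp_step(price, target, max_change)
        blocks += 1
    return blocks


print(blocks_to_reach(100.0, 110.0))  # 10
print(blocks_to_reach(100.0, 150.0))  # 41
print(blocks_to_reach(100.0, 50.0))   # 69
```

Under this assumption a 10% move is absorbed in 10 blocks, while a 50% rise takes 41 blocks and a 50% drop takes 69.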

Timo Harings:
Yeah, but now I need something that can mirror price changes of 50% to 1000%, for example.

jl777:
Like @Dukeleto said, use logarithms to reduce the dynamic range or you can change the 1% tolerance level to a different amount, but that will prevent detecting false data if it is too big.
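
One way to read the logarithm suggestion is to apply the tolerance to ln(value) rather than to the value itself. The Python sketch below is only an interpretation under that assumption; the function names are hypothetical and it deliberately handles only values greater than 1, where the logarithm is positive.

```python
import math


def log_change(old: float, new: float) -> float:
    """Relative change of ln(value), for values greater than 1."""
    if old <= 1 or new <= 1:
        raise ValueError("this sketch only handles values greater than 1")
    return (math.log(new) - math.log(old)) / math.log(old)


def raw_band(value: float, tolerance: float = 0.01) -> tuple[float, float]:
    """Raw values accepted when the tolerance is applied to ln(value)."""
    if value <= 1:
        raise ValueError("this sketch only handles values greater than 1")
    log_value = math.log(value)
    return (math.exp(log_value * (1 - tolerance)),
            math.exp(log_value * (1 + tolerance)))


# A tenfold move (+900% raw) is only about +50% on the log scale.
print(round(log_change(100.0, 1000.0), 3))  # 0.5
# A 1% band around ln(100) spans roughly 95.5 to 104.7 in raw terms.
low, high = raw_band(100.0)
print(round(low, 1), round(high, 1))        # 95.5 104.7
```

The compression comes at a price: the same 1% band is wider in raw terms, which is the loss of detection power jl777 warns about when the tolerance gets too big.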

Timo Harings:
Is there a way to use Komodo Oracles for non-price use cases?

jl777:
For non-numeric things, as long as it is just one valid answer, then all the nodes would require that specific answer.

Timo Harings:
But I would have to change the whole consensus rule of the Oracle chain. I still need the 1% rule for my price-data use cases.

jl777:
Yes, we are all about making custom consensus (CC) here. You can have dozens of different consensus rules, each applying to a specific type of tx.
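
The idea of several consensus rules, each tied to a type of transaction, can be pictured as a lookup table of validators. The Python sketch below is purely illustrative and does not reflect how Komodo’s CC modules are implemented; the tx type names and functions are invented for the example.

```python
from typing import Callable

Validator = Callable[[object, object], bool]


def within_one_percent(submitted: float, local: float) -> bool:
    """Numeric rule: accept values within 1% of the node's local value."""
    return abs(submitted - local) <= 0.01 * abs(local)


def exact_match(submitted: object, local: object) -> bool:
    """Non-numeric rule: there is one valid answer and it must match."""
    return submitted == local


# Hypothetical tx types, each bound to its own consensus rule.
RULES: dict[str, Validator] = {
    "price": within_one_percent,
    "match_result": exact_match,
}


def validate_tx(tx_type: str, submitted: object, local: object) -> bool:
    """Apply the rule registered for tx_type; unknown types are rejected."""
    rule = RULES.get(tx_type)
    if rule is None:
        return False
    return rule(submitted, local)


print(validate_tx("price", 100.5, 100.0))          # True
print(validate_tx("price", 102.0, 100.0))          # False
print(validate_tx("match_result", "2-1", "2-1"))   # True
print(validate_tx("match_result", "3-0", "2-1"))   # False
```

The 1% rule stays in place for price data while a different rule covers data with a single valid answer.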

gcharang:
And we can have as many secure chains as we want here in Komodo Platform with dPoW. A new chain for every application.

Timo Harings:
Yeah but the only thing making the price ‘true’ is the 1% rule. If we can’t apply it to non-numeric data, how can we now detect the true and false data? Only by democratic game theory? I could Sybil attack it then.

jl777:
You mean true/false oracle? That would be easiest as there is no variance that needs to be handled.

Timo Harings:
@gcharang Yeah, I understand that one. Unlike other cryptos that have weak hashpower or centralized stakes, Komodo is kinda linked to the Bitcoin chain and gets the security of that hashpower in some way. I don’t know exactly how it works but I heard about it. That’s cool.

jl777:
Rather than have a single chain do everything, we just spin up a dedicated chain for a dedicated task. With CC pretty much anything can be done. At least so far I have been able to do dozens of new consensus rules.

Timo Harings:
The decision making is easier, but you don’t have security against Sybil Attack here.

jl777:
Yes there is. It doesn’t matter how many miners you have or nodes, if the data is invalid, it is rejected like trying to use a timestamp from tomorrow.

Timo Harings:
I’m gonna use a chain that is for non-numeric data. And I see there are currently 50 nodes live. Now I’m gonna spin up 100 and feed wrong data into it. What prevents me from that? Who says it’s invalid?

jl777:
Let’s change this to discuss timestamp to make it more concrete, ok? You are the attacker and you want to convince the network that your timestamp is the right one.

Timo Harings:
Timestamp is a different thing

jl777:
Why? It is data from outside world, each node has a local system clock.

Timo Harings:
Sorry, thought about block number not timestamp… lol

jl777:
So you want to convince the network it is actually one hour from now, ok? Try your attack. All the nodes are checking incoming timestamps to be no more than 1 minute in the future, relative to the local system clock. Anything else is rejected, and any node that does this too much is banned. You can have 1 million nodes all one hour ahead, attack! What do you submit to the network? A miner of invalid data doesn’t get to mine the block, and if they persist they get banned and no node will interact with them. Since the attacker’s block is not accepted, an honest miner mines the block with a valid timestamp.
To attack this you need to change the local system clocks on all the nodes. It is so simple, anybody can tell it is resistant to bad data. There is no complex interaction of reputation, rewards, punishments, frontrunning, freeloading, all that complexity goes out the window replaced by each node comparing all data with their local version.
There are timewarp attacks, which involve biasing the timestamp forward and backward within allowed limits to game the mining difficulty, and similarly the data can be biased +/-1% here, which is an issue and maybe the tolerance should be lowered to 0.5%, but that slows down adapting to real price changes of 10% or more and maybe we get more local nodes rejecting what is valid.
It is literally 500 lines of code I wrote in a weekend. I did think about all the rewards/incentives systems and my conclusion was that it was very complex to get it perfectly right and Zack would just say “off-chain bets”. If you can understand timestamps in blocks, you understand most of how the trustless Oracles work. It takes advantage of the identical mechanism, I just piggybacked oracle data onto it.
I haven’t implemented all possible Oracles, that is not my goal. With this as an example, any decent developer can make their own trustless Oracle for their data source. I prefer things that are working and allow others to make improvements and variations of it.
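
The timestamp challenge jl777 poses can be simulated in a few lines of Python. This is a toy model, not Komodo’s networking code: the class `HonestNode`, the ban threshold of three invalid blocks and the node counts are assumptions; only the one-minute limit comes from the conversation.

```python
MAX_FUTURE_SECONDS = 60  # from the conversation: "no more than 1 minute"
BAN_THRESHOLD = 3        # hypothetical: invalid blocks tolerated before a ban


class HonestNode:
    def __init__(self) -> None:
        self.strikes: dict[str, int] = {}
        self.banned: set[str] = set()

    def accepts(self, peer: str, block_time: int, local_time: int) -> bool:
        """Check one incoming block against this node's own clock."""
        if peer in self.banned:
            return False
        if block_time - local_time > MAX_FUTURE_SECONDS:
            self.strikes[peer] = self.strikes.get(peer, 0) + 1
            if self.strikes[peer] >= BAN_THRESHOLD:
                self.banned.add(peer)
            return False
        return True


now = 1_560_000_000
honest_nodes = [HonestNode() for _ in range(50)]
attackers = [f"attacker-{i}" for i in range(1000)]

# Every attacker claims it is one hour later than it really is, three times.
accepted = 0
for _ in range(BAN_THRESHOLD):
    for peer in attackers:
        for node in honest_nodes:
            accepted += node.accepts(peer, now + 3600, now)

print(accepted)                     # 0
print(len(honest_nodes[0].banned))  # 1000
# An honest miner's block with a sane timestamp is accepted by every node.
print(all(n.accepts("honest-miner", now + 5, now) for n in honest_nodes))  # True
```

The number of attacking identities never enters the check, because each honest node compares the block only with its own clock.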

FishyGuts:
LINK tokens are used to pay data providers, Chainlink node operators, payment providers, and other online service providers for their services. Smart contract users will compensate the data providers that they use with LINK tokens.
How’s this working out? Curious.

jl777:
Don’t worry, Komodo won’t obsolete LINK, but maybe the hundreds of trustless Oracle chains made by others will put a dent in it.

FishyGuts:
True. Komodo will probably benefit LINK in the future.

Timo Harings:
Sorry, I don’t have much time right now but I would be happy to continue tomorrow maybe. I really just wanted a healthy debate and I am really happy to have had the discussion above. Sorry for the abrupt end. Thanks for your time guys!

FishyGuts:
@Timo Harings, James will be glad to have discussions on these topics anytime.

jl777:
Eager to hear how you will corrupt the timestamp an hour ahead.

Timo Harings:
Yeah, let’s continue where we stopped later. Cheers 🙂
