Introduction to CKB Script Programming 6: Type ID
You might noticed there's a hash_type
field in CKB's Script structure. For simplicity, we've been ignoring that till now. This post will provide an introduction to the hash_type
field, and the unique capability it can bring.
In the blockchain space, there's a pair of issues that everyone has to pick sides:
Upgradability
Can I upgrade a smart contract after it's deployed on the blockchain? Suppose a smart contract gets widely adopted, then all of a sudden, someone notices a bug in the smart contract(sadly this will always happen in our industry), can we upgrade the smart contract to a fixed version without affecting all the users?
A different scenario: as technology advances, there might be new algorithms that could be leveraged to build smart contracts that run faster. Can we upgrade existing smart contracts to benefit more users?
Determinism
This one has 2 parts:
- Determinism A: if I pick a smart contract to guard my tokens, will my tokens stay safe(could be unlocked by me and only by me) in the future?
- Determinism B: if I sign a transaction now, and send it later, will my transaction still be accepted by the blockchain?
Note that a secure blockchain has more deterministic requirements than those mentioned here. I'm only including properties that relate to the discussed problem here.
Conflicts
If we think about it, we could find that there're always conflicts between upgradability and determinism:
- If a smart contract could be upgraded, it might have different behaviors, hence enabling an attacker to unlock a cell, or disabling unlocking from the owner himself/herself.
- If a smart contract could be upgraded, an already signed transaction might run into different behaviors, resulting it be rejected by the blockchain.
Historically, there is only one side you can pick, and most existing blockchains have picked the side of determinism. The "code is law" idea thus becomes very famous in the blockchain space. But we all know software design is all about tradeoffs. Given certain situation, it might make sense to sacrifice slight determinism, in exchange for the ease of upgradability. For example, a large organization might have special security team monitoring potential vulnerabilities in smart contracts use by them. By granting them the ability to upgrade smart contracts, they have their own control in fixing vulnerabilities instead of waiting on someone. In the meantime, determinism property A won't be an issue for them since they are in charge of their own cells and smart contracts.
So now the question is: can we enable this new possibility in CKB? It turns out we can and this whole mechanism is already deployed in current mainnet CKB! But to understand the whole mechanism in CKB, we have to first take a look at some other ideas that support this mechanism.
NOTE: I'm sure you will ask: does this mean you will sacrifice determinism in CKB? Let me assure you this feature is totally optional in CKB, you can perfectly practice "code is law" principle in CKB like in other blockchains. We are just hoping this unique feature will provide new possibilities for people who really need it.
Write A Unique Type Script
Let's first ask a question: how can we create a type script that ensures only one live cell in CKB can have that unique type script? By unique type script, we mean the whole type script structure, including code hash, hash type and args.
Note that this question might seem irrelevant to script upgradability right now, but please bear with me, we will see how it will contribute to the final solution in CKB.
Note if you have seen my previous UDT article, you might already realize there is a solution. But if you haven't, I do suggest to take a step back and think about how you can implement such a script on your own. It will be a really good learning experience.
If you are ready, here's the basic workflow of the script:
- Count how many output cells use current type script, if there's more than one cell with current type script, returns a failure return code;
- Count how many input cells use current type script, if there's one input cell with current type script, returns a success return code;
- Use CKB syscall to read the first OutPoint in current transaction;
- If the OutPoint data read match the
args
part of current type script, returns a success return code; - Returns a failure return code otherwise.
Putting in simple C code, the script would look like following:
#include "blockchain.h"
#include "ckb_syscalls.h"
#define INPUT_SIZE 128
#define SCRIPT_SIZE 32768
int main() {
uint64_t len = 0;
int ret = ckb_load_cell(NULL, &len, 0, 1, CKB_SOURCE_GROUP_OUTPUT);
if (ret != CKB_INDEX_OUT_OF_BOUND) {
return -1; /* 1 */
}
len = 0;
ret = ckb_load_cell(NULL, &len, 0, 0, CKB_SOURCE_GROUP_INPUT);
if (ret != CKB_INDEX_OUT_OF_BOUND) {
return 0; /* 2 */
}
/* 3 */
unsigned char input[INPUT_SIZE];
uint64_t input_len = INPUT_SIZE;
ret = ckb_load_input(input, &input_len, 0, 0, CKB_SOURCE_INPUT);
if (ret != CKB_SUCCESS) {
return ret;
}
if (input_len > INPUT_SIZE) {
return -100;
}
unsigned char script[SCRIPT_SIZE];
len = SCRIPT_SIZE;
ret = ckb_load_script(script, &len, 0);
if (ret != CKB_SUCCESS) {
return ret;
}
if (len > SCRIPT_SIZE) {
return -101;
}
mol_seg_t script_seg;
script_seg.ptr = (uint8_t *)script;
script_seg.size = len;
if (MolReader_Script_verify(&script_seg, false) != MOL_OK) {
return -102;
}
mol_seg_t args_seg = MolReader_Script_get_args(&script_seg);
mol_seg_t args_bytes_seg = MolReader_Bytes_raw_bytes(&args_seg);
if ((input_len == args_bytes_seg.size) &&
(memcmp(args_bytes_seg.ptr, input, input_len) == 0)) {
/* 4 */
return 0;
}
return -103;
}
As explained in the UDT post, attackers will be prevented in several different directions:
- If an attacker tries to create a cell with the exactly same type script, there will be 2 cases: a. A valid transaction will have different OutPoint data in the first input from the type script args; b. If the user tries to duplicate type script args as the first transaction input, CKB will signal a double-spent error;
- If the attacker tries to use a different type script args, it will be a different type script by definition.
This way, we can ensure a cell will have unique type script across all live cells in CKB. Considering each script has an associated hash, we will have a cell in CKB with its unique hash, or unique ID.
Resolving Scripts in CKB Transaction
Now let's look at how CKB resolves the script to run before the hash type
change:
- CKB extracts
code hash
value from the script structure to run. - It loops through all dep cells, computes the hash of each dep cell data. If any dep cell data hash matches the specified
code hash
, CKB uses the data in the found dep cell as the script to run. - If no dep cell has data hash matching specified
code hash
, CKB results in a validation error.
The upgradability problem, in fact lies in the way we test for dep cells. Right now we are testing against data hash, and when a script is upgraded, a different hash will be generated, the matching would fail. That brings a question: can we test for dep cells using a different solution? Is there something that can stay unchanged when the script is changed? Considering the actual script to run lives in a cell, we can rephrase the question in a different way:
Is there something that can stay unchanged when a cell's data get updated?
We can use a script structure! Since lock script is typically used for signature verification, we can use a type script for this problem. A type script can stay perfectly unchanged when a cell's data get changed. Hence we added hash type
field to CKB's script structure, and modified the script resolving flow to the following:
- For each dep cell, we extract the
test hash
based onhash type
value in the script structure:- If
hash type
isdata
, the dep cell's data hash is used astest hash
- If
hash type
istype
, the dep cell's type script hash is used astest hash
- If
- CKB extracts
code hash
andhash type
values from the script structure to run - If CKB finds a dep cell whose
test hash
matches specifiedcode hash
, CKB uses the data in the found dep cell as the scrip to run. - If no dep cell has
test hash
matching specifiedcode hash
, CKB results in a validation error.
Notice the hash type
used here, is the value belonging to the script to run, not the values of scripts in a dep cell. You could perfectly have 2 inputs in a transaction, one using data
as hash type
, the other using type
as hash type
. Either one of them will use its own correct way to locate the correct cell.
This way we have totally conquered determinism property A above, but for property B there will be some subtle affects, we will discuss those in more details below.
Putting Everything Together
There's still one problem unsolved, there might still be an attack:
- A lock script L1 is stored in a cell C1 with type script T1
- Alice guards her cells via lock script L1 using
hash type
astype
, by definition, she fills her script structure'scode hash
field with the hash of type script T1 - Bob creates a different cell C2 with a always success lock script L2, and also use the same type script T1
- Now Bob can use C2 as a dep cell to unlock Alice's cell. CKB won't be able to distinguish that Alice wants to use C1, while Bob provides C2. Both C1 and C2 use T1 as type script
That teaches us a very important lesson: if you build a lock/type script and want people to leverage the upgradability property, you have to make sure to use a type script that is unique and unforgeable.
Luckily, we just solved this problem! In the above we just developed a type script that can provide a unique type script hash in CKB. Combined together, this unique type script and hash type
feature in CKB provide a way to upgrade already deployed smart contracts.
Chicken Egg Problem
You might have noticed that CKB already implements such a script, named a type ID script. But unlike the other system scripts, this one is implemented purely in Rust outside of CKB VM. I'm sure you will have this rant: you said CKB VM is flexible, don't you?
I do have made that claim in many different occasions, and in fact I've shown you above a way to implement the type ID logic in C code that could be compiled and run in CKB VM. But the problem here, is that we are at a chicken-egg situation: if we implement the type ID script, should we make it upgradable? If so, what should we put in as its type script? And if it's not upgradable, how can we make sure it's free from bugs and all the other vulnerabilities?
We thought hard on this question, and it really seems the easiest and safest solution, is to implement the type ID script via pure Rust, so we can deal with any potential situations in a more mature and experienced way.
But unlike some other chains which keep provide discounts for builtin smart contracts, we do set a very large cycle cost on the type ID script to discourage the use of the builtin script. At mainnet, it is now set as 1 million cycles. This is much much larger than the same logic implemented on CKB VM, we are hoping one day there will be fully vetted and secure type ID script that everyone will use instead of using this builtin type ID script.
Trust Issue
We do want to mention this new feature is not without its drawbacks. You should really understand how it works and the tradeoffs involved before using it. This section tries to discuss all those considerations, but I do want to warn you it might still miss something, and you should use your own judgments.
Determinism Property A
The type ID solution does provide a way to solve property A: when a bug is discovered, it's possible to fix the smart contracts without affecting existing cells using the same smart contracts. But it does require certain considerations:
Ownership
With a type ID solution, people might be able to steal your coin by hacking the cell containing smart contracts you use. Since typical cells are guarded by only one or more signatures, some human errors could cause major problems. Fundamentally, it's a tradeoff situation:
- For the truly paranoid ones, they might want to stick to the old way of referencing a script via its own hash, not the type script hash of the containing cell. The "code is law" principle is fully enforced here. You will know exactly what can be used to unlock your cell, and it won't change anytime in the future.
- For people willing to sacrifice a little bit, they can gain the ability to upgrade existing smart contracts for different benefits. But please do make sure to fully understand the lock script of the cell containing the script you use.
For example, if you look at the deployed system scripts in CKB's mainnet, they all use type ID setup, but they all have their lock script's code hash set to all zeros, meaning no one gets to change the lock script easily via a signature. We want to make sure there's a need to change the default lock script, it will be via a fork agreed upon by the whole community, after all IMHO it really is the whole Nervos community that owns CKB :P
There is some initial ideas that you can use to guard the script contained in a cell. For example, in addition to type ID logic, the type script can also contains other logics to validate the actual script contained in its cell. For example, formal analysis methods might be used, or certain test cases can be provided in which the type script run the actual script against. Only when the analysis or the test cases pass, will one be able to change script contained in a cell. Though I must say this is just some priliminary ideas, much research and development work needs to be done to make this a reality.