Scripting Bitcoin Transactions
Created On 15. May 2021
Updated: 2021-06-06 01:35:22.465170000 +0000
Created By: acidghost
Bitcoin's transaction scripting language is called Script. It is what makes "programmable money" possible and is used to lock and unlock a UTXO. Script is a very simple language that, for the sake of security, is not Turing complete and is stateless. Turing incompleteness prevents logic attacks such as infinite loops, which could otherwise be embedded in a script to cause a DoS (denial-of-service) attack against the Bitcoin network. Because Script is stateless, a script executes the same way on every node, so a transaction that is valid for one node is valid for everyone.
In Bitcoin applications, the locking script is a spending condition placed on an output. It usually appears as scriptPubKey (other names are witness script or cryptographic puzzle). The name comes from the fact that it usually contains a public key or Bitcoin address, although the scripting technology allows far more than that. The unlocking script, on the other hand, usually appears as scriptSig (another name is witness).
Script is called a stack-based language because it uses a data structure called a stack. Some familiarity with Intro to Reverse Engineering and a bit of assembly will help in understanding how data behaves on the stack.
Script has specific operators that move data around the stack. They are derived from push and pop. Some might like to joke about programming languages on drugs, but I will keep it decent here :octocat: For example, OP_ADD will pop two items, add them and then push the result. Another example is OP_EQUAL, which evaluates a condition. It will pop two items from the stack and push TRUE (1) if they are equal or FALSE (0) if they are not. This is how a script signals whether a transaction is valid or not.
Let's check an example of how a script executes some simple math:
3 10 OP_ADD 2 OP_SUB 1 OP_ADD 6 OP_EQUAL
The math for this is ((3 + 10) - 2) + 1 = 12, which is then compared with 6. Since 12 is not equal to 6, OP_EQUAL pushes the value FALSE onto the stack.
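The evaluation can be traced with a small stack machine. This is a minimal Python sketch, not Bitcoin's actual interpreter: the function name run_script is hypothetical, and only the three opcodes used in the example are handled.

```python
def run_script(script: str) -> list[int]:
    """Evaluate a space-separated script made of integers and a few opcodes."""
    stack: list[int] = []
    for token in script.split():
        if token == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "OP_SUB":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)  # second-from-top minus top
        elif token == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)  # TRUE is 1, FALSE is 0
        else:
            stack.append(int(token))  # anything else is treated as a number to push
    return stack


# The script from the text: 12 is compared with 6, so FALSE (0) is left on the stack.
print(run_script("3 10 OP_ADD 2 OP_SUB 1 OP_ADD 6 OP_EQUAL"))
```

Changing the final 6 to 12 makes the same script leave TRUE (1) on the stack.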
For a transaction to be valid, the combined scripts must leave TRUE on the stack, meaning the spending condition is satisfied. First the unlocking script is executed. If it executes without errors, the main stack is copied and the locking script is then executed against that stack.
Pay-to-Public-KeyHash (P2PKH)
Many Bitcoin transactions spend outputs that are locked with a P2PKH script. This script locks the output to a public key hash, which is what a Bitcoin address encodes. Inside the script the hash is represented in hex, as opposed to the Base58Check encoding that is used for the address. The output is unlocked and can be spent by presenting the public key together with a digital signature created by the corresponding private key. The locking script of such a transaction output will look similar to:
OP_DUP OP_HASH160 <receiver public key hash> OP_EQUALVERIFY OP_CHECKSIG
This locking script is satisfied by an unlocking script such as:
<receiver signature> <receiver public key>
The two scripts are then joined together. The combined script below evaluates to TRUE only if the unlocking script provides a valid signature from the receiver's private key, the one that corresponds to the public key whose hash is set in the locking script:
<receiver signature> <receiver public key> OP_DUP OP_HASH160
<receiver public key hash> OP_EQUALVERIFY OP_CHECKSIG
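The stack behaviour of the combined script can be sketched in Python. Everything cryptographic here is a loudly labeled stand-in: toy_hash160 is a truncated SHA256 rather than the real RIPEMD160(SHA256(x)), and toy_checksig only compares byte strings instead of verifying an ECDSA signature. All names are hypothetical; the point is the order of the stack operations.

```python
import hashlib


def toy_hash160(data: bytes) -> bytes:
    # Stand-in for HASH160 (real Script uses RIPEMD160 over SHA256).
    return hashlib.sha256(data).digest()[:20]


def toy_checksig(signature: bytes, pubkey: bytes) -> bool:
    # Stand-in for OP_CHECKSIG: no real cryptography, just a marker check.
    return signature == b"signed-by:" + pubkey


def run_p2pkh(script: list) -> bool:
    stack = []
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)  # data push
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(toy_hash160(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False  # hashes differ: the script fails immediately
        elif item == "OP_CHECKSIG":
            pubkey, signature = stack.pop(), stack.pop()
            stack.append(toy_checksig(signature, pubkey))
    return bool(stack) and stack[-1] is True


pubkey = b"receiver-public-key"
signature = b"signed-by:" + pubkey
locking = ["OP_DUP", "OP_HASH160", toy_hash160(pubkey), "OP_EQUALVERIFY", "OP_CHECKSIG"]
unlocking = [signature, pubkey]
print(run_p2pkh(unlocking + locking))
```

Presenting a different public key fails at OP_EQUALVERIFY, and presenting the right key with a wrong signature fails at OP_CHECKSIG.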
Multisignature
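A multisignature script sets a condition where N public keys are recorded in the script and at least M of them must provide signatures to unlock the funds (an M-of-N scheme). An example of a 2-of-3 script, combined with its unlocking script, is:
0 <Signature A> <Signature B> 2 <Public Key A> <Public Key B> <Public Key C> 3 CHECKMULTISIG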
Note: the 0 at the beginning is due to a bug in CHECKMULTISIG
that pops an extra value from the stack. The extra value must be present for it to work and pushing 0 is a workaround to the bug.
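The extra pop can be made visible with a toy version of the opcode. This is a sketch under assumptions: the script is in the full form with the key count N placed right before CHECKMULTISIG, the function names are hypothetical, and toy_checksig is a string comparison rather than real signature verification.

```python
def toy_checksig(signature: str, pubkey: str) -> bool:
    # Stand-in for signature verification: "sig:X" is valid for pubkey "X".
    return signature == "sig:" + pubkey


def checkmultisig(stack: list) -> None:
    n = stack.pop()
    pubkeys = [stack.pop() for _ in range(n)][::-1]
    m = stack.pop()
    signatures = [stack.pop() for _ in range(m)][::-1]
    stack.pop()  # the extra pop: this consumes the dummy element (the leading 0)
    # Signatures must appear in the same order as their public keys, so each
    # signature is matched against the keys that have not been consumed yet.
    remaining_keys = iter(pubkeys)
    ok = all(any(toy_checksig(sig, key) for key in remaining_keys) for sig in signatures)
    stack.append(1 if ok else 0)


stack = [0, "sig:A", "sig:B", 2, "A", "B", "C", 3]
checkmultisig(stack)
print(stack)
```

Without the leading 0, the extra pop hits an empty stack and the sketch raises IndexError, which mirrors why the dummy value has to be present.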
Pay-to-Script-Hash (P2SH)
P2SH simplifies the use of complex transaction scripts. One problem with multisignature scripts is that, because they carry multiple public keys, they can become very long. The larger size increases the transaction fees. They are also hard to deal with, because everyone paying into such a script would have to understand how it works and use a wallet that supports it. With P2SH the long locking script is replaced by a 20-byte cryptographic hash of that script. The hash is created by applying the SHA256 algorithm and then RIPEMD160 to the script. The full script, called the redeem script, is not part of the locking script; it is presented later, when the output is spent. It is constructed like the multisignature locking script and contains no signatures. It would be similar to:
2 <Public Key A> <Public Key B> <Public Key C> 3 CHECKMULTISIG
The locking script contains only the 20-byte hash of the redeem script (written <P2SH> here) and would look as follows:
HASH160 <P2SH> EQUAL
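The signatures are presented later, in the unlocking script, together with the redeem script itself (the leading 0 is the CHECKMULTISIG workaround described above):
0 <Signature A> <Signature B> <redeem script>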
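The combined P2SH script is evaluated in two stages. First the redeem script is checked against the hash in the locking script:
<redeem script> HASH160 <P2SH> EQUAL
If the hashes match, the redeem script itself is executed with the signatures from the unlocking script:
0 <Signature A> <Signature B> 2 <Public Key A> <Public Key B> <Public Key C> 3 CHECKMULTISIG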
P2SH addresses start with a 3 and are Base58Check encodings of the 20-byte hash of the redeem script. Addresses that start with 1 are Base58Check encodings of the 20-byte hash of a public key.
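The leading character follows from the version byte that Base58Check prepends to the hash. The sketch below assumes the mainnet version bytes 0x00 (public key hash) and 0x05 (script hash), and uses 20 zero bytes as a placeholder where a real HASH160 result would go. The function name base58check is hypothetical.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: int, payload: bytes) -> str:
    data = bytes([version]) + payload
    # The checksum is the first 4 bytes of a double SHA256 over version + payload.
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


placeholder_hash = bytes(20)  # stands in for a real 20-byte hash
p2pkh_address = base58check(0x00, placeholder_hash)
p2sh_address = base58check(0x05, placeholder_hash)
print(p2pkh_address[0], p2sh_address[0])
```

The version byte 0x00 is itself a leading zero byte, which is why public key hash addresses always begin with 1.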
Timelocks
Timelocks restrict transactions or outputs so that they can be spent only after a certain point in time. At the transaction level this is implemented with the nLocktime field. UTXO-level timelocks were introduced later with the CHECKLOCKTIMEVERIFY and CHECKSEQUENCEVERIFY opcodes. Timelocks extend Bitcoin scripting into the dimension of time, which makes multi-step smart contracts possible.
nLocktime has existed since the beginning. A value of zero indicates immediate propagation and execution. If a transaction is transmitted to the network before its nLocktime is reached, the first node that receives it rejects it as invalid and does not relay it.
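How a node reads the field can be sketched as follows. The threshold of 500 million is not mentioned in the text above; it is the protocol's cut-off between values read as block heights and values read as Unix timestamps. The function names are hypothetical, and the sketch ignores the rule that nLocktime is disregarded when every input has nSequence set to 0xFFFFFFFF.

```python
LOCKTIME_THRESHOLD = 500_000_000  # below: block height, at or above: Unix time


def describe_locktime(n_locktime: int) -> str:
    """Say how a given nLocktime value is interpreted."""
    if n_locktime == 0:
        return "no timelock"
    if n_locktime < LOCKTIME_THRESHOLD:
        return f"block height {n_locktime}"
    return f"unix time {n_locktime}"


def locktime_reached(n_locktime: int, block_height: int, block_time: int) -> bool:
    """True if a transaction with this nLocktime may go into the given block."""
    if n_locktime == 0:
        return True
    reference = block_height if n_locktime < LOCKTIME_THRESHOLD else block_time
    return n_locktime < reference


print(describe_locktime(700_000))
print(locktime_reached(700_000, block_height=700_000, block_time=0))
print(locktime_reached(700_000, block_height=700_001, block_time=0))
```

A transaction locked to height 700000 is therefore not acceptable in block 700000 itself, only from block 700001 onward.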
CHECKLOCKTIMEVERIFY is a timelock applied per output rather than per transaction, as is the case with nLocktime. This is done by adding the opcode to the redeem script of an output. For example, an output locked for 2 months will look like this:
<now + 2 months> CHECKLOCKTIMEVERIFY DROP DUP HASH160 <sender's public key hash> EQUALVERIFY CHECKSIG
Note: EQUALVERIFY will not leave anything on the stack. CHECKLOCKTIMEVERIFY behaves differently: when it is satisfied, the time parameter that preceded it remains as the top item on the stack, so it is followed by DROP to remove that item before the rest of the script runs. The standard for this is defined in BIP-65, see https://github.com/bitcoin/bips/blob/master/bip-0065.mediawiki.
CHECKSEQUENCEVERIFY is a script-level timelock that is relative, unlike nLocktime and CHECKLOCKTIMEVERIFY, which are absolute in time. A relative timelock starts counting when the UTXO is recorded in the blockchain: it depends on the confirmation of a previous transaction and the time is counted relative to that. At the transaction level, relative timelocks can be set on each input of a transaction through its nSequence field. Relative timelocks with nSequence are specified in BIP-68 and the CHECKSEQUENCEVERIFY opcode in BIP-112.
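The nSequence encoding from BIP-68 can be sketched in a few lines. The constants follow BIP-68: bit 31 disables the relative timelock, bit 22 selects time-based locks counted in units of 512 seconds, and the low 16 bits hold the value. The helper names are hypothetical.

```python
SEQUENCE_LOCKTIME_DISABLE_FLAG = 1 << 31
SEQUENCE_LOCKTIME_TYPE_FLAG = 1 << 22
SEQUENCE_LOCKTIME_MASK = 0x0000FFFF
SEQUENCE_LOCKTIME_GRANULARITY = 9  # time-based locks use units of 2**9 = 512 seconds


def relative_lock_blocks(blocks: int) -> int:
    if not 0 <= blocks <= SEQUENCE_LOCKTIME_MASK:
        raise ValueError("block count does not fit in 16 bits")
    return blocks


def relative_lock_seconds(seconds: int) -> int:
    unit = 1 << SEQUENCE_LOCKTIME_GRANULARITY
    units = -(-seconds // unit)  # round up to whole 512-second units
    if not 0 <= units <= SEQUENCE_LOCKTIME_MASK:
        raise ValueError("duration does not fit in 16 bits")
    return SEQUENCE_LOCKTIME_TYPE_FLAG | units


def decode_sequence(n_sequence: int):
    if n_sequence & SEQUENCE_LOCKTIME_DISABLE_FLAG:
        return None  # relative timelock disabled for this input
    value = n_sequence & SEQUENCE_LOCKTIME_MASK
    if n_sequence & SEQUENCE_LOCKTIME_TYPE_FLAG:
        return ("seconds", value << SEQUENCE_LOCKTIME_GRANULARITY)
    return ("blocks", value)


thirty_days = relative_lock_seconds(30 * 24 * 3600)
print(hex(thirty_days), decode_sequence(thirty_days))
```

Because of the 512-second granularity, a 30-day lock is rounded up slightly when encoded this way.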
Median-Time-Past changes how time is calculated for both absolute and relative timelocks. Since Bitcoin is a decentralized network, every participant has their own perspective of time. Network latency has to be factored into the perspective of each node, and the network only reaches consensus, about every 10 minutes, on the state of the ledger as it existed in the past. Block timestamps are set by the miners, and because some deviation is tolerated, miners have an incentive to lie about the time in a block to earn extra fees by including timelocked transactions that are not yet mature. Median-Time-Past is calculated by taking the timestamps of the last 11 blocks and finding the median. This way no single miner can use its block timestamp to pull immature transactions into a block. It changes the implementation of nLocktime, CHECKLOCKTIMEVERIFY, nSequence and CHECKSEQUENCEVERIFY, and is defined in BIP-113.
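The calculation itself is short. The sketch below uses made-up timestamps and a hypothetical function name; it only illustrates why one dishonest timestamp has little effect on the median.

```python
def median_time_past(timestamps: list[int]) -> int:
    """Median of the timestamps of the last 11 blocks (fewer if the chain is shorter)."""
    last_eleven = sorted(timestamps[-11:])
    return last_eleven[len(last_eleven) // 2]


# Made-up block timestamps, roughly 10 minutes apart.
honest = [1000, 1600, 2200, 2800, 3400, 4000, 4600, 5200, 5800, 6400, 7000]
# Same chain, but the miner of the latest block lies about the time.
lying = honest[:-1] + [99999]

print(median_time_past(honest), median_time_past(lying))
```

Both calls give the same median, so the exaggerated timestamp in the last block does not move the time used for timelock checks.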
Defending Timelocks Against Attacks
One possible attack is for miners to re-mine the last block in order to increase their fees. They can do this by choosing any valid transactions, logically the ones with the highest fees, and rewriting them into a replacement for the past block. This kind of attack is known as fee sniping. To prevent this, Bitcoin Core uses nLocktime to limit new transactions to the next block. When such an attack takes place, the miners are not able to pull high-fee transactions from the mempool, because those transactions are timelocked to the next block; when re-mining the past block, only the transactions that were valid at that time can be included. This is achieved by setting the nLocktime of all new transactions to <current block # + 1> and setting the nSequence on all inputs to 0xFFFFFFFE to enable nLocktime.
Flow Control
Although Script is a stack-based, Turing incomplete language, it still offers flow control through the IF, ELSE, ENDIF and NOTIF opcodes, plus boolean expressions with the BOOLAND, BOOLOR and NOT opcodes. Because it is a stack language, flow control in Script reads backward. In practice, it is important to remember that the condition comes before the IF, and not after it as in most procedural languages. Flow control is often used to construct a redeem script such as the one below:
IF
< Pubkey A > CHECKSIG
ELSE
< Pubkey B > CHECKSIG
ENDIF
The condition for such a script is supplied in the unlocking script, which thereby chooses the execution path. For example, the owner of Pubkey A redeems with < Sig A > 1. The 1 at the end evaluates to TRUE, so the IF clause is executed. Likewise, for the owner of Pubkey B to redeem this, the unlocking script has to select the ELSE clause with < Sig B > 0.
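How the value pushed before IF selects the branch can be traced with a toy interpreter. This is a sketch under assumptions: the names are hypothetical, only IF, ELSE, ENDIF and CHECKSIG are handled, and toy_checksig is a string comparison instead of real signature verification.

```python
def toy_checksig(signature: str, pubkey: str) -> bool:
    # Stand-in for signature verification: "sig:X" is valid for pubkey "X".
    return signature == "sig:" + pubkey


def run(tokens: list) -> bool:
    stack = []
    executing = []  # one flag per open IF: is the current branch being executed?
    for token in tokens:
        active = all(executing)
        if token == "IF":
            # The condition is already on the stack: it was pushed before the IF.
            executing.append(bool(stack.pop()) if active else False)
        elif token == "ELSE":
            executing[-1] = not executing[-1]
        elif token == "ENDIF":
            executing.pop()
        elif not active:
            continue  # inside a branch that was not selected
        elif token == "CHECKSIG":
            pubkey, signature = stack.pop(), stack.pop()
            stack.append(1 if toy_checksig(signature, pubkey) else 0)
        else:
            stack.append(token)  # data push
    return bool(stack) and bool(stack[-1])


redeem_script = ["IF", "A", "CHECKSIG", "ELSE", "B", "CHECKSIG", "ENDIF"]
print(run(["sig:A", 1] + redeem_script))  # takes the IF clause
print(run(["sig:B", 0] + redeem_script))  # takes the ELSE clause
```

A signature for B combined with a 1 selects the IF clause and fails there, since that clause checks against Pubkey A.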
One Last Example
IF
IF
2
ELSE
< 30 days > CHECKSEQUENCEVERIFY DROP
< Pubkey D > CHECKSIGVERIFY
1
ENDIF
< Pubkey A > < Pubkey B > < Pubkey C > 3 CHECKMULTISIG
ELSE
< 90 days > CHECKSEQUENCEVERIFY DROP
< Pubkey D > CHECKSIG
ENDIF
This script implements 3 execution paths. The first execution path, which consists of lines 3 and 9 of the script, is a simple 2-of-3 multisig between the 3 partners (Pubkey A, B and C). The second can be used only 30 days after the creation of the UTXO and requires a signature for Pubkey D plus a signature from one of the 3 partners. In the third path, the funds can be spent after 90 days with a signature for Pubkey D alone. For each of these, certain conditions have to be met. Can you figure out the unlocking scripts for each of them?
References
Mastering Bitcoin - Andreas M. Antonopoulos
Section: Blockchain