Hashing a struct field from the other fields of the same struct

I'm new to Go and am starting by trying to build a simple blockchain. I'm having trouble creating a hash of the blocks. Can anyone help me with how I could pass the other fields of the struct into the Hash() function when setting the Hash field of that same struct? Does the hashing need to happen outside of the struct somehow, or is this even possible?

Block Struct

type Block struct {
  Index int
  PrevHash string
  Txs []Tx
  Timestamp int64
  Hash string
}

Set Struct Example

Block{
  Index: 0,
  PrevHash: "Genesis",
  Txs: []Tx{},
  Timestamp: time.Now().Unix(),
  Hash: Hash(/* How do I pass the other fields data here... */), 
}

My Hash Function

func Hash(text string) string {
  hash := md5.Sum([]byte(text))
  return hex.EncodeToString(hash[:])
}

My Imports (if helpful)

import (
  "crypto/md5"
  "encoding/hex"
  "fmt"
  "time"
)

Answer:

There are a lot of ways to do this, but since you're looking for a simple one: serialise the data, hash it, and assign the result. The easiest way is to marshal your Block type, hash the resulting bytes, and assign that to the Hash field. This has to happen after the struct value exists, because a struct literal can't refer to its own other fields.

Personally, I prefer to make this more explicit by splitting out the data that makes up the hash into its own type and embedding that type in the block type, but that really is up to you. Be advised that iterating over maps in Go is not deterministic. encoding/json sorts map keys when marshalling, but other encoders or hand-rolled serialisation may not, so depending on what's in your Tx type, you may need some more work there.

Anyway, with embedded types, it'd look like this:

// you'll rarely interact with this type directly, never outside of hashing
type InBlock struct {
    Index     int    `json:"index"`
    PrevHash  string `json:"PrevHash"`
    Txs       []Tx   `json:"txs"`
    Timestamp int64  `json:"timestamp"`
}

// almost identical to the existing block type
type Block struct {
    InBlock // embed the block fields
    Hash      string
}

Now, the hashing function can be turned into a receiver function on the Block type itself:

// CalculateHash computes the hash and sets it on the Block's Hash field.
// It returns an error if the hash data can't be serialised.
func (b *Block) CalculateHash() error {
    data, err := json.Marshal(b.InBlock) // marshal the InBlock data
    if err != nil {
        return err
    }
    hash := md5.Sum(data)
    b.Hash = hex.EncodeToString(hash[:])
    return nil
}

Now the only real difference is how you initialise your Block type:

block := Block{
    InBlock: InBlock{
        Index:     0,
        PrevHash:  "Genesis",
        Txs:       []Tx{},
        Timestamp: time.Now().Unix(),
    },
    Hash: "", // can be omitted
}
if err := block.CalculateHash(); err != nil {
    panic("something went wrong: " + err.Error())
}
// block now has the hash set
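
To make it harder to forget the CalculateHash call, the two steps can be wrapped in a constructor. What follows is only a minimal sketch: the NewBlock helper and the fields of the Tx type are made up for illustration (the question doesn't show Tx), and the timestamp is passed in so the result is reproducible.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"time"
)

// Tx is a placeholder; the real type is not shown in the question.
type Tx struct {
	From   string `json:"from"`
	To     string `json:"to"`
	Amount int64  `json:"amount"`
}

// InBlock holds every field that feeds into the hash.
type InBlock struct {
	Index     int    `json:"index"`
	PrevHash  string `json:"PrevHash"`
	Txs       []Tx   `json:"txs"`
	Timestamp int64  `json:"timestamp"`
}

// Block embeds InBlock and adds the hash computed from it.
type Block struct {
	InBlock
	Hash string
}

// CalculateHash marshals the embedded InBlock and stores its MD5 as hex.
func (b *Block) CalculateHash() error {
	data, err := json.Marshal(b.InBlock)
	if err != nil {
		return err
	}
	sum := md5.Sum(data)
	b.Hash = hex.EncodeToString(sum[:])
	return nil
}

// NewBlock is a hypothetical helper: it builds a block and sets its hash
// in one step, so callers never see a block without a hash.
func NewBlock(index int, prevHash string, txs []Tx, timestamp int64) (*Block, error) {
	b := &Block{InBlock: InBlock{
		Index:     index,
		PrevHash:  prevHash,
		Txs:       txs,
		Timestamp: timestamp,
	}}
	if err := b.CalculateHash(); err != nil {
		return nil, err
	}
	return b, nil
}

func main() {
	genesis, err := NewBlock(0, "Genesis", []Tx{}, time.Now().Unix())
	if err != nil {
		panic(err)
	}
	fmt.Println(genesis.Index, genesis.Hash)
}
```

The next block in the chain would then be created with something like NewBlock(1, genesis.Hash, txs, time.Now().Unix()).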

To access fields on your block variable, you don't need to specify InBlock: the Block type doesn't have any fields whose names shadow the fields of the type it embeds, so this works:

txs := block.Txs
// just as well as this
txs = block.InBlock.Txs

Without embedding types, it would end up looking like this:

type Block struct {
    Index     int    `json:"index"`
    PrevHash  string `json:"PrevHash"`
    Txs       []Tx   `json:"txs"`
    Timestamp int64  `json:"timestamp"`
    Hash      string `json:"-"` // exclude from JSON marshalling
}

Then the hash stuff looks like this:

func (b *Block) CalculateHash() error {
    data, err := json.Marshal(b)
    if err != nil {
        return err
    }
    hash := md5.Sum(data)
    b.Hash = hex.EncodeToString(hash[:])
    return nil
}

Doing things this way, the Block type can be used exactly as you are using it right now. The downside, at least in my opinion, is that debugging/dumping data in a human-readable format is a bit annoying: because of the json:"-" tag, the hash is never included in a JSON dump. You could work around that by only including the Hash field in the JSON output if it is set, but that would really open the door to weird bugs where hashes don't get set properly.
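
One possible way around the dumping annoyance is a separate method that is only used for debugging and never for hashing. This is a sketch under assumptions, not part of the approach above: DebugJSON is a made-up helper name, and Tx is again a placeholder type. It relies on encoding/json preferring the shallower Hash field over the embedded one that is tagged json:"-".

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Tx is a placeholder; the real type is not shown in the question.
type Tx struct {
	Data string `json:"data"`
}

type Block struct {
	Index     int    `json:"index"`
	PrevHash  string `json:"PrevHash"`
	Txs       []Tx   `json:"txs"`
	Timestamp int64  `json:"timestamp"`
	Hash      string `json:"-"` // excluded from the JSON that gets hashed
}

// CalculateHash hashes the JSON form of the block (without the Hash field).
func (b *Block) CalculateHash() error {
	data, err := json.Marshal(b)
	if err != nil {
		return err
	}
	sum := md5.Sum(data)
	b.Hash = hex.EncodeToString(sum[:])
	return nil
}

// DebugJSON is a hypothetical helper for dumps only. It returns the same
// JSON as json.Marshal(b) with the hash added under a "hash" key.
func (b Block) DebugJSON() ([]byte, error) {
	type plain Block // same fields and tags as Block
	return json.Marshal(struct {
		plain
		Hash string `json:"hash"` // plain.Hash is tagged "-", so only this one is emitted
	}{plain: plain(b), Hash: b.Hash})
}

func main() {
	b := Block{PrevHash: "Genesis", Txs: []Tx{}, Timestamp: 1700000000}
	if err := b.CalculateHash(); err != nil {
		panic(err)
	}
	out, err := b.DebugJSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because CalculateHash still goes through json.Marshal(b), the dump format has no influence on the hash itself.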

About the map comment

Iterating over maps is non-deterministic in Go. Determinism, as you probably know, is very important in blockchain applications, and maps are very commonly used data structures in general. When several nodes process the same workload, it's absolutely crucial that each one of them produces the same hash (provided they do the same work, obviously). Let's say you had decided, for whatever reason, to define Txs as a map by ID (so Txs map[uint64]Tx). encoding/json does sort map keys when marshalling, so its output is stable, but it sorts them as strings ("10" comes before "2"), and any other serialiser or hand-written encoding that simply ranges over the map gives no ordering guarantee at all. If you want full control over the order, you'd need to marshal/unmarshal the data in a way that addresses this problem:

// a new type that you'll only use in custom marshalling
// Txs is a slice here, using a wrapper type to preserve ID
type blockJSON struct {
    Index     int    `json:"index"`
    PrevHash  string `json:"PrevHash"`
    Txs       []TxID `json:"txs"`
    Timestamp int64  `json:"timestamp"`
    Hash      string `json:"-"`
}

// TxID is a type that preserves both Tx and ID data
// Tx is a pointer to prevent copying the data later on
type TxID struct {
    Tx *Tx    `json:"tx"`
    ID uint64 `json:"id"`
}

// note the json tags are gone
type Block struct {
    Index     int
    PrevHash  string
    Txs       map[uint64]Tx // as a map
    Timestamp int64
    Hash      string
}

func (b Block) MarshalJSON() ([]byte, error) {
    cpy := blockJSON{
        Index:     b.Index,
        PrevHash:  b.PrevHash,
        Txs:       make([]TxID, 0, len(b.Txs)), // allocate slice
        Timestamp: b.Timestamp,
    }
    keys := make([]uint64, 0, len(b.Txs)) // slice of keys
    for k := range b.Txs {
        keys = append(keys, k) // add keys to the slice
    }
    // now sort the slice. I prefer Stable, but for int keys Sort
    // should work just fine
    sort.SliceStable(keys, func(i, j int) bool {
        return keys[i] < keys[j]
    })
    // now we can iterate over our sorted slice and append to the Txs slice,
    // ensuring the order is deterministic
    for _, k := range keys {
        tx := b.Txs[k] // map elements are not addressable, so copy first
        cpy.Txs = append(cpy.Txs, TxID{
            Tx: &tx,
            ID: k,
        })
    }
    // now we've copied over all the data, we can marshal it:
    return json.Marshal(cpy)
}
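
The ordering issue that the sorted key slice takes care of can be seen in a small sketch. The marshalMap helper is made up for illustration. Ranging over a map gives no ordering guarantee, while encoding/json sorts map keys by their string form, which is stable but not numeric.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalMap is a hypothetical helper that serialises a map keyed by
// uint64 using encoding/json and returns the result as a string.
func marshalMap(m map[uint64]string) string {
	data, err := json.Marshal(m)
	if err != nil {
		panic(err)
	}
	return string(data)
}

func main() {
	m := map[uint64]string{10: "a", 2: "b", 1: "c"}

	// Ranging over the map: the order is unspecified and may differ
	// from one run to the next.
	for k, v := range m {
		fmt.Println(k, v)
	}

	// encoding/json sorts the keys as strings: "1", "10", "2".
	fmt.Println(marshalMap(m)) // {"1":"c","10":"a","2":"b"}
}
```

With the custom MarshalJSON, the transactions come out in ascending numeric ID order instead, and the order is spelled out in your own code rather than left to the encoder.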

The same must be done for the unmarshalling, because the serialised data is no longer compatible with our original Block type:

func (b *Block) UnmarshalJSON(data []byte) error {
    wrapper := blockJSON{} // the intermediary type
    if err := json.Unmarshal(data, &wrapper); err != nil {
        return err
    }
    // copy over fields again
    b.Index = wrapper.Index
    b.PrevHash = wrapper.PrevHash
    b.Timestamp = wrapper.Timestamp
    b.Txs = make(map[uint64]Tx, len(wrapper.Txs)) // allocate map
    for _, tx := range wrapper.Txs {
        b.Txs[tx.ID] = *tx.Tx // copy over values to build the map
    }
    return nil
}

Instead of copying over field-by-field, especially because we don't really care whether the Hash field retains its value, you can just reassign the entire Block variable:

func (b *Block) UnmarshalJSON(data []byte) error {
    wrapper := blockJSON{} // the intermediary type
    if err := json.Unmarshal(data, &wrapper); err != nil {
        return err
    }
    *b = Block{
        Index:     wrapper.Index,
        PrevHash:  wrapper.PrevHash,
        Txs:       make(map[uint64]Tx, len(wrapper.Txs)),
        Timestamp: wrapper.Timestamp,
    }
    for _, tx := range wrapper.Txs {
        b.Txs[tx.ID] = *tx.Tx // populate map
    }
    return nil
}

But yeah, as you can probably tell: avoid maps in types that you want to hash, or implement a different method that computes the hash in a more reliable way.
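
As one example of such a "different method", the hash can be computed without going through JSON at all, by writing each field to the hasher in a fixed order. This is only a sketch: the Tx type with a single Payload field is an assumption (the question doesn't show Tx), and the length prefixes and big-endian encoding are choices made here, not an established format.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"sort"
)

// Tx is a placeholder; the real type is not shown in the question.
type Tx struct {
	Payload string
}

type Block struct {
	Index     int
	PrevHash  string
	Txs       map[uint64]Tx
	Timestamp int64
	Hash      string
}

// CalculateHash feeds the fields into the hasher in a fixed order.
// Map entries are visited in ascending ID order, so the result does not
// depend on Go's map iteration order.
func (b *Block) CalculateHash() {
	h := md5.New()
	writeUint64 := func(v uint64) {
		var buf [8]byte
		binary.BigEndian.PutUint64(buf[:], v)
		h.Write(buf[:]) // Write on a hash.Hash never returns an error
	}
	writeString := func(s string) {
		writeUint64(uint64(len(s))) // length prefix keeps field boundaries unambiguous
		h.Write([]byte(s))
	}

	writeUint64(uint64(b.Index))
	writeString(b.PrevHash)
	writeUint64(uint64(b.Timestamp))

	ids := make([]uint64, 0, len(b.Txs))
	for id := range b.Txs {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })

	writeUint64(uint64(len(ids)))
	for _, id := range ids {
		writeUint64(id)
		writeString(b.Txs[id].Payload)
	}
	b.Hash = hex.EncodeToString(h.Sum(nil))
}

func main() {
	b := Block{
		PrevHash:  "Genesis",
		Txs:       map[uint64]Tx{2: {Payload: "second"}, 1: {Payload: "first"}},
		Timestamp: 1700000000,
	}
	b.CalculateHash()
	fmt.Println(b.Hash)
}
```

The trade-off is that every new field has to be added to CalculateHash by hand, whereas the JSON approach picks up new fields automatically.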